Prefaces


Preface to the First Edition 


The last few years have brought with them a happy increase in rapport 
between the economic theorist and the managerial economist. This de-
velopment has involved their simultaneous realization that business
practice can be a fertile source of more abstract analytical ideas and that 
the theorist’s rigorous tools can make an important contribution to the 
analysis of applied problems. That, in essence, is the spirit in which this 
book was written. 

The subject of this book is economic theory, not operations research. 
The volume is intended to offer the reader both a systematic exposition 
of received microeconomic analysis, and an intuitive grasp of the many 
recent developments in mathematical economics that have too long re- 
mained a mystery in the private possession of the specialists (who, it must 
be admitted, have always been willing and anxious to share their secrets). 
The discussions of applications of economic theory to the tools of operations 
research and to business analysis are primarily illustrative, and though a 
considerable portion of the body of operations research equipment is 
described, the result can by no means be considered to constitute a survey
of the field. As one reader has suggested, this book is intended to be more
helpful to an operations researcher who wishes to learn economics than to
an economist who desires a systematic education in operations research. 


For their helpful comments and suggestions on all or part of the manu- 
script I must thank Forman Acton, Wroe Alderson, S. T. Beza, W. W. 
Cooper, Robert Dorfman, Ralph Gomory, Herman Karreman, Robert 
Kuenne, Harold Kuhn, Don Patinkin, Gardner Patterson, Maurice Peston, 
and, above all, Alvaro Lopez and Richard Quandt. The devoted labors of 
my research assistant, Charles Frisbie, and the extraordinary workmanship 
of my secretary, Mrs. C. B. Brown, were of immeasurable help. The role 
of my several years’ experience with the management consulting firm of 
Alderson Associates, Inc., will be apparent in many parts of the book. I
must also acknowledge my sincere gratitude to the Ford Foundation,
whose grant to the Department of Economics at Princeton helped to
finance both the research involved in the more original portions of this 
volume and the typing of the manuscript. 

Finally, I must thank the editors of the several journals involved, as 
well as my co-authors, Ralph Gomory and Philip Wolfe, who graciously 
permitted me to reprint portions of the following articles: “On the Role 
of Marketing Theory,” Journal of Marketing, Vol. XXI (April, 1957); 
“Selecting an Appropriate Model for an Operations Research Problem,”
Vol. VIII (November, 1955), “Solution of Management Problems Through
Mathematical Programming,” Vol. IX (May, 1956), “Operations Research
Applied to Marketing Problems,” Vol. X (March, 1957) and “A Guide to
Operations Research Methods,” Vol. X (April, 1957), all in Cost and Profit
Outlook; “Community Indifference,” Review of Economic Studies, Vol. XIV
(1946-47); “On the Theory of Oligopoly,” Economica, Vol. XXV (August,
1958); “Marginalism and the Demand for Cash in Light of Operations 
Research Experience,” Review of Economics and Statistics, Vol. XL (August, 
1958); (P. Wolfe co-author), “A Warehouse-Location Problem,” Operations 
Research, Vol. 6, No. 2 (March-April, 1958); “Economic Theory and the 
Political Scientist,” World Politics, Vol. VI (January, 1954); “Activity 
Analysis in One Lesson,” American Economic Review, Vol. XLVIII (De-
cember, 1958); (R. Gomory co-author), “Integer Programming and
Pricing,” Econometrica, Vol. 28 (1960); and “The Cardinal Utility Which
Is Ordinal,” Economic Journal, Vol. LXVIII (December, 1958). 


Preface to the Second Edition 


No doubt it is in the nature of things that revised versions of books 
appear as “second edition, expanded.” This book is no exception. Although
two chapters from the first edition have been expunged, on balance the 
book has grown at a rate not too dissimilar to the GNP. 

There have been only a few minor changes in the text itself, most
notably an attempt to improve the explanation of the basic theorem of
linear programming. A substantial number of exercises have been added,
and brief discussions of the applications of differential calculus to standard 
economic theory have been inserted at the ends of Chapters 9, 11, 13, and 
14. The five new chapters include one on duality, one on linear program- 
ming and production, one on statistical problems in demand estimation, 
and two on capital theory and its applications. The duality chapter and the 
chapter on demand estimation and its appendix provide elementary ma- 
terials that are, I believe, particularly difficult to find elsewhere. This is 
especially true of the latter, which discusses some of the econometric 
techniques for dealing with simultaneous equation problems and treats 
such subjects as least squares bias, identification, and simultaneous 
equation estimates in an intuitive manner. 

As usual I find myself deeply indebted to a number of persons for their 
very substantial help in the preparation of this second edition—to my 
colleagues Harold Kuhn, Burton Malkiel, Richard Quandt, and Frederic 
Scherer for their many suggestions, to Robert Bushnell for his revision of 
the chapter on computers, to Edward Pearsall for proofreading and super- 
vising the preparation of the new diagrams, and to Mrs. C. B. Brown for 
her superb workmanship in the preparation of the manuscript. To all of 
these I am most grateful. To those others whose assistance has momentarily 
slipped my mind, I can only apologize. 


Preface to the Third Edition 


This edition differs from its predecessor largely in the addition of some 
fairly extensive materials on the Kuhn-Tucker Theorem, including a discus- 
sion of a number of its important applications in economics. Several exer- 
cises using these materials are intended to demonstrate how this powerful 
theorem can be applied to obtain qualitative results in economic analysis.
In addition, the discussion of the simplex method has been modified and, I 
hope, improved, at several points. 

Professors A. W. Tucker and H. W. Kuhn were extremely generous in 
helping me at various stages in the revision. Thus they must get the credit 
not only for the substance of the new material, but for drawing, most gent- 
ly, to my attention several weaknesses in the presentation in an earlier draft. 

I am also heartily indebted to the many students who over the years 
have made a sport of catching errors in the book. I do not delude myself, 
[...] of mistakes. I grow increasingly convinced [...] it reproduces itself and
multiplies, so that no sooner has one generation of errors been brought
under control than it is replaced by a host of [...] apparently from nowhere.

Finally I want to thank for their help in the final task of preparation of
the new edition the several very competent secretaries, who were more 
patient with me than I deserved, and Mr. Stephen E. Kagann, who 
conducted so capably the latest hunt for inaccuracies. 


Preface to the Fourth Edition 


If this book had feelings, no doubt—like GBS on his 90th birthday—it 
would be somewhat surprised to find itself alive and apparently well in its 
fourth metamorphosis. I, too, have been surprised and delighted by the 
increasing frequency with which younger colleagues throughout the pro- 
fession tell me that they have been subjected at some time to its materials; 
happily, as far as I could tell, the experience did not elicit their lasting 
resentment. 

When it was first written, the book was intended to guide readers to 
the frontiers of economic analysis. This new edition represents continued 
dedication to that goal. Frontiers have a way of moving, and the contents 
of the volume have had to change accordingly. I have added discussions of 
a variety of what I believe to be important materials on topics such as the
duality analysis of consumption and production (including Shephard’s
lemma), the Ramsey-Boiteux theorem on quasi-optimal pricing under a
budget constraint, properties of quasi-concave utility functions and their 
relationship to ordinal theory, and the reswitching debate in the Cambridge- 
Cambridge controversy. Many of these have never before appeared in a 
textbook or have been dealt with only cursorily. Although some of the new 
materials are, in the nature of the case, somewhat more difficult than the 
discussions of the standard theory, they have been tested in classes by 
myself and others and, as far as I can judge from both written and oral 
comments and from examination results, they have passed the test of 
comprehensibility. 

The new edition contains two essentially new chapters—one on com- 
parative statics (Chapter 13) and the other on duality theory in consump- 
tion and production (Chapter 14). In addition, five chapters have been 
revised extensively—those on the neoclassical theory of consumption and 
production (Chapters 9 and 11), the chapter on welfare theory (21), and 
the chapters on distribution (24) and capital theory (26). Finally, the 
organization of the book has been revised on the basis of economic area 
covered, rather than on the degree of novelty of the materials. 

As always, my debts are great, and words are the only coin I have to 
offer in repayment. 

My greatest debts, for painstaking reading and detailed and invaluable 
comments, are to Elizabeth Bailey, David Folkerts-Landau, Lester Lave, 
and Jerome Hass. Their suggestions added to my labors, but that additional
effort was thoroughly worthwhile. 

In addition, I received very useful comments on all or part of the 
manuscript from Sebastian Arango, Alan Blinder, Michael Rothschild, 
Vu Viet, and the members of my first-year graduate class in microeconomics 
at Princeton in the Fall of 1974. 

I was helped in the task of revising the reading lists by Roger Klein, 
Wassily Leontief, Charles McCallum, Janusz Ordover, Richard Quandt, 
and Andrew Schotter. To all of them I offer my sincere thanks. 

Finally, and most strongly, I must express my appreciation to Sue Anne
Batey, my research assistant and secretary at Princeton, for her intelligent 
assistance, her ingenuity in grasping the intent of my unintelligible inten- 
tions, her ability to bring order out of chaos, and, above all, her qualities 
as a human being. 


W.J.B. 
Princeton and New York Universities 
Optimization
and an Example 
from Inventory Analysis 


1. Optimization—a Basic Viewpoint 


One of the hallmarks of the economic theorist’s (and the operations
researcher’s) approach to the analysis of business behavior and business 
problems is the concept of optimization. In business practice it is common 
to see management’s decisions made on the basis of some set of fixed 
numbers which are meant to represent the extent of the opportunities open 
to the firm. For example, businessmen frequently arrange for market 
surveys to estimate how much of their products they will be able to sell in 
the next year or some other period in the future. On the basis of such 
figures, which management seems to treat as fixed constants (under some 
such name as “market potential”), it decides how much raw material to 
put into inventory, how many salesmen to hire, etc. 

This sort of reasoning is the antithesis of the approach of the economic
theorist and the operations researcher. In their analyses, one starts from 
the position that there is no one fixed amount of any commodity which 
buyers are prepared to purchase. Rather, sales will depend on price, ad- 
vertising expenditure, and a host of other variables whose values may be 
under the businessman’s control. For this reason, the number of salesmen 
to be hired should not be based on any fixed estimate of future sales, for 
the size of the sales force helps, in turn, to determine the sales volume. 

Instead of a fixed sales figure, optimality analysis therefore deals with 
an array of possibilities, often infinite in number. Which of these possibili- 
ties will in fact occur depends on the decisions made by the executives in 
question. The analyst, then, does not confine his analysis to a single possible 
decision, treating it as though it were the businessman’s only option, 
because ordinarily he will have a wide set of choices open to him, any 
one of which may permit him to stay in business or even to prosper. He 
may, with relative impunity, surely spend somewhat more or somewhat 
less on advertising, make an upward or downward change in the size of 
his sales force, in his inventory levels, and often in his prices, though the 
effects of these alternatives are rarely investigated in the standard market 
survey. The approach of optimality analysis is to take these alternatives 
into account and to ask which of these possible sets of decisions will come
closest to meeting the businessman’s objectives, i.e., which decisions will 
be best or optimal. 


2. Optimality Analysis in Operations Research 


The foregoing does not mean that in applied operations research work 
the analyst even pretends to be able to find the best of all possible decisions. 
The data are too inaccurate, the tools of analysis are often too blunt, and 
the operations researcher’s acquaintance with the details of the firm’s 
operations and his general business “know-how” are usually too limited 
for him to be able to come up with anything more than approximations to 
the ideal of the true optimum. Nevertheless, an analysis which is specifically 
designed to look for optimal decisions, crude and approximative though it 
may be, is very likely to do much better than the workable but relatively 
arbitrary rules of thumb of obscure origin which play so prominent a part 
in business practice. 

It is easy to provide illustrative examples of these standard business 
decision rules: 


1. Inventory levels. The quantity of any product which company X 
carries in inventory is kept (approximately) equal to the amount which 
its customers normally buy in sixty days or some other such fixed period. 

2. Pricing. The price of any of company X’s products is set at its cost 
per unit plus a standard fixed percentage “mark-up.” 

3. Advertising budgeting. A fixed per cent of the firm’s revenues (sales) 
is more or less automatically set aside for advertising. 

These crude rules often exhibit serious shortcomings. For example, we 
will see later in this chapter that the inventory rule of thumb (1) is likely 
to result in excess inventories of some items and insufficient stocks of others, 
and in later chapters it will be shown that the pricing rule (2) is unlikely 
to maximize profits or sales or anything else which the businessman may 
be expected to consider important. 
Most businessmen recognize these rules of thumb for what they are:
rough but serviceable management tools. The operations researcher, by 
systematically seeking to determine the very best of the available possi- 
bilities, may at least hope to do better than the old standard rules of thumb. 


3. The Role of Optimality in Economic Analysis 


The economist’s interest in optimization is of another sort entirely. At 
least in part his position, relative to that of the operations researcher, is 
somewhat analogous to the physicist’s relation to the engineer. A primary 
aim of the economist is to understand business behavior rather than to 
make recommendations to businessmen. His understanding of economic 
processes provides part of the foundation for the analysis of the operations
researcher. 

The concept of optimality is important to the economist for his analysis, 
theoretical and applied, of public policy problems; but it also helps him to 
understand the behavior of businessmen, consumers, and other members 
of the economy. It is at least possible that sheer business acumen and 
experience permit management and other economic units to arrive at 
decisions which come close to being optimal. Moreover, in business, com- 
petition may soon eliminate firms whose decision-making is consistently 
poor. To the extent that these assertions are valid, optimality analysis 
should serve as a relatively good predictor of economic behavior; that is, 
it should provide a reasonably good explanation of actual economic deci-
sions and activities. In economic theory it is therefore customary to employ
an optimality premise in discussing the behavior of firms, consumers, and 
other economic units. It is simply assumed that these units' decisions are 
approximately optimal, and the consequences of this assumption are then 
usually presented as a rough description of economic behavior in the real
world. Thus, in effect, the economist tells us only what a rational individual, 
who is also a well-trained and efficient calculator of optimal decisions, 
would do in his economic activities. 

Because of this orientation of so much of economic analysis, the theory 
of optimal decision-making will constitute a central theme of this book.


4. Illustration: A Simple Inventory Problem 


The reader may well feel, with some justification, that he has always 
believed in optimal decisions and that the concept involves relatively little 
that is new. Two aspects of the approach, however, are likely to be novel.
The first is the explicit consideration of the entire relevant range of pos- 
sibilities. Rather than considering whether the firm can maintain its posi- 
tion with a $2 million advertising budget, we try to examine the effect of 
each and every possible budget, say between $0.5 million and $4 million. A
second feature of the optimality calculation which is apt to be novel is the
drawing together of these materials into a more systematic and rigorous 
analysis. These two aspects of optimization are, perhaps, best brought out 
by illustration. For this purpose let us examine the simplest (and, therefore, 
the crudest) of the models of inventory analysis. It is to be emphasized 
that this model is selected only for expository purposes; the reader must 
keep in mind that for this reason it is necessary to ignore many crucial 
features of real inventory problems. He should notice, however, how far 
the analysis carries us on the basis of very little initial information. It 
pulls implications from the model much as a magician pulls rabbits out of a 
hat—we know that the rabbits must have been there to begin with, but 
their presence was by no means obvious, and the skill with which they 
are produced is often impressive. 

Our analysis deals with a retailer who (perhaps on the basis of contracts) 
confidently expects to sell some fixed amount, call it Q* units, of one of 
his commodities over the next year at a predetermined price, with demand 
spread evenly throughout the year. How much inventory should he keep 
on hand? He has considerable choice in the matter. For example, if Q* =
100,000 units he can meet his demand by having the entire amount de- 
livered to his warehouse at the beginning of January, keeping it in stock 
until it is gradually depleted by shipments to his customers; alternatively, 
he can have 50,000 units delivered to him right after the first of the year 
and another equal amount on July 1. Still another alternative is to have
four quarterly deliveries of 25,000 units, and so on. 

Now the first alternative (receipt of the whole amount at the beginning 
of the year) involves an inventory which begins with 100,000 units and 
ends with zero, so that his average inventory is 50,000 units. Similarly, 
the second (two-delivery) procedure involves inventories which begin 
with 50,000 units and end with zero, so that in this case the average stock 
on hand is 25,000 units, etc. Thus, by ordering more and more frequently, 
the required average inventory level can be made smaller and smaller. 
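
The pattern in these examples can be checked with a few lines of arithmetic. The following Python sketch is only an illustration: the function name and its arguments are our own labels, and it assumes, as the text does, that demand is spread evenly and that each delivery is run down to zero before the next arrives.

```python
def average_inventory(annual_sales, deliveries_per_year):
    """Average stock on hand under evenly spread demand.

    Assumes each delivery is sold off at a steady rate until stock
    reaches zero, at which moment the next delivery arrives.
    """
    delivery_size = annual_sales / deliveries_per_year
    # Stock falls steadily from delivery_size to zero,
    # so its average is the midpoint of the two.
    return (delivery_size + 0) / 2


# The three delivery schedules mentioned in the text, for Q* = 100,000 units.
for n in (1, 2, 4):
    print(n, "deliveries -> average inventory", average_inventory(100_000, n))
```

With one, two, and four deliveries the average inventory comes to 50,000, 25,000, and 12,500 units respectively.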

Here, then, is the range of possibilities which our optimality analysis 


1 An asterisk is written after the Q to indicate that this letter represents a definite 
number which is known to the firm. This convention will be used throughout this section 
to distinguish such numbers from the variables whose values are the unknowns of the 
analysis. 

2 Of course, in practice it is normally never planned to have inventory run out al-
together. For unexpected demands or delays in deliveries could then embarrass the
businessman who had no stocks on hand to service the waiting customers. In the analysis
which follows the reader can, therefore, if he wishes substitute some minimum inventory
quantity M* for this zero whenever it appears. He will find that no change in the analysis
results.
must consider: The basic question is, how far should this process of cutting
down on inventory be carried? A smaller inventory, of course, saves money 
on inventory carrying costs: that is, on storage costs, interest cost on the 
cash used to buy the inventory, etc. But, on the other hand, there is a
reorder cost involved in placing and delivering an order, and since a smaller 
inventory involves more frequent orders and deliveries, if management 
decides on too small an average inventory level these costs may become 
prohibitive. Determination of the optimal inventory level involves a 
systematic balancing of the savings in inventory carrying costs against the 
increased reorder costs which reduced inventory will require. 


5. Determination of the Cost Relationship 


To find the optimal inventory level (the level which does the job at
minimum cost) we must now go through the rather painful process of 
finding mathematical expressions for these two types of costs: 


1. Carrying cost. We saw in our example that the average inventory 
level is one-half the amount received in a shipment. Thus, in general nota- 
tion, let the quantity delivered to our retailer be D units per shipment 
(if it is all delivered in January, D = 100,000 units in our example). Then,
as has been assumed, if demand is spread evenly throughout the year, 
inventory would fall at a steady rate from the day it is delivered until it 
is used up. Thus the inventory must fall gradually from D to zero so the 
average inventory level must be 


$$\frac{D + 0}{2} = \frac{D}{2}.$$


Now let k* (dollars) represent the interest and other carrying cost 
involved in holding one unit of inventory for one year. Then the total 
carrying cost will be the annual carrying cost per unit times the (average) 
number of units in inventory = k*D/2. 

2. Reorder cost. If 100,000 units are to be sold and 25,000 units are 
delivered per shipment, then clearly 4 = 100/25 deliveries will be required 
over the course of the year. More generally, if Q* is to be sold over the 
course of the year and D is delivered each time, the required number of 
deliveries is Q*/D. 


Suppose, moreover, that the cost per delivery is related to the amount 
delivered by the expression a* + b*D where a* and b* are some numbers. 
Here b* may be interpreted as the shipping cost per item so that the cost 
of sending D items is b*D dollars. Similarly, a* represents costs such as 
bookkeeping and long-distance telephoning for orders—in other words, 
costs whose magnitude is not seriously affected by the amount involved
in the shipment. 


We can now calculate the total annual reordering cost; it will equal the 
number of deliveries multiplied by the cost per delivery, i.e., 


$$\frac{(a^* + b^*D)\,Q^*}{D} = \frac{a^*Q^*}{D} + \frac{b^*Q^*D}{D} = \frac{a^*Q^*}{D} + b^*Q^*.$$

The total cost which our retailer lays out on his inventory is the sum
of these two costs: the carrying and the reorder cost. It is therefore equal to 


$$C = \frac{k^*D}{2} + \frac{a^*Q^*}{D} + b^*Q^*.$$

This is the relationship which we have been seeking.
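
The way the two cost components combine can be made concrete in a short Python sketch. The function names are our own, and the numbers in the usage lines are purely illustrative assumptions chosen for easy arithmetic; they are not figures from the text. The sketch simply confirms that the sum of carrying cost and reorder cost agrees with the expanded expression k*D/2 + a*Q*/D + b*Q*.

```python
def carrying_cost(D, k):
    """Annual carrying cost: unit carrying cost k times average inventory D/2."""
    return k * D / 2


def reorder_cost(D, Q, a, b):
    """Annual reorder cost: Q/D deliveries, each costing a + b*D."""
    deliveries = Q / D
    return deliveries * (a + b * D)


def total_cost(D, Q, k, a, b):
    """Total annual inventory cost C as the sum of the two components."""
    return carrying_cost(D, k) + reorder_cost(D, Q, a, b)


# Illustrative (hypothetical) values, chosen only to keep the arithmetic simple.
D, Q, k, a, b = 25, 100, 2, 10, 1
expanded = k * D / 2 + a * Q / D + b * Q   # the closed expression for C
print(total_cost(D, Q, k, a, b), expanded)
```

Both printed values coincide, as the algebra requires; note that the term b*Q* does not depend on D at all, which is why it plays no part in choosing the delivery size.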


6. The Optimality Calculation 


Let us pause now to examine what has so far been accomplished. In 
effect, the only unknown in the preceding equation is the (optimal) value
of D, the amount to be delivered per shipment. Once this number is deter- 
mined the entire problem is solved, because we can automatically know 
the corresponding average inventory level (= D/2) and the number of 
times per year shipments should be ordered (= Q*/D). 

But once we have found our equation, the solution of the problem is 
reduced to a simple problem of computation, for the equation gives us a 
direct relationship between costs and the alternative values of our variable, 
D. For example, suppose the numbers in the equation were Q* = 100 
(thousand), k* = 8, a* = 60, and b* = 3. Then the equation becomes 


$$C = \frac{8D}{2} + \frac{60 \cdot 100}{D} + 3 \cdot 100$$

or

$$C = 4D + \frac{6000}{D} + 300.$$


With such an equation the optimal value of D can be approximated 
by a number of trial calculations. One can simply take a number of alterna- 
tive values of D, substitute them in turn into the equation, and compute 
the corresponding values of C, thus finding, roughly, the value of D which 
gives the lowest cost. For example, setting D = 10 (thousand units) we 
obtain C = 40 + 600 + 300 = 940, and similarly, when D = 20, C = 680, 
and so on, as shown in the following table:

D     10    20    30    40    50    60    70    80   ...
C    940   680   620   610   620   640   666   695   ...


Examination of this table readily suggests the (correct) conclusion that 
the optimal value of D is approximately 40 (thousand) units per delivery. 
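
The trial calculations are easily mechanized. The Python sketch below is one possible way of doing so, not a prescribed procedure: it uses the numbers of the example (Q* = 100, k* = 8, a* = 60, b* = 3, with quantities in thousands), and the function name and the choice of trial values are our own.

```python
def total_cost(D, Q=100, k=8, a=60, b=3):
    """Inventory cost C = k*D/2 + a*Q/D + b*Q for the example's numbers."""
    return k * D / 2 + a * Q / D + b * Q


trial_values = range(10, 90, 10)   # D = 10, 20, ..., 80 (thousand units)

for D in trial_values:
    print(f"D = {D:2d}   C = {total_cost(D):6.1f}")

# The trial value with the lowest cost approximates the optimum.
best_D = min(trial_values, key=total_cost)
print("lowest-cost trial value:", best_D)
```

A finer grid of trial values around the best one would sharpen the approximation further, at the price of more computation.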

Thus, by finding the inventory cost equation we have obtained all the 
information required for the solution of our problem. An equation of this 
variety is called an objective function because it shows how the firm’s 
objective (cost minimization) is affected by the different values of the
variable in question. We shall encounter such objective functions through- 
out this book. 

In effect, then, the inventory problem has now been solved. However, 
an additional bit of mathematical analysis will enable us to extract a 
great deal of additional information from this solution. The standard 
methods of the differential calculus (Chapter 4) can be used to obtain 
from our cost equation another equation which gives us the optimal value 
of our variable D. This equation is³

$$D = \sqrt{\frac{2a^*Q^*}{k^*}}.$$


This result gives us the optimal average inventory level D/2 and the 
optimal reorder quantity, D, corresponding to any levels of sales volume 
Q*, unit carrying cost k*, and fixed reorder cost a*. The result is, therefore, 
not tied to any particular numbers such as Q* = 100, k* = 8, etc., as was 
our numerical computation. As a result, this equation can be used to see 
what happens to the optimal value of D when some of these numbers 
change. It can readily be seen to indicate, as might be expected, that the 
optimum inventory level D/2 should be increased when sales Q* go up. 
It also calls for an increase in the size of each delivery D (a reduction in 
the number of deliveries) when the reorder (delivery) cost a* increases. 


3 Proof: The optimal value of D is that which minimizes total inventory cost, C. We 
therefore differentiate C with respect to D, set the derivative dC/dD equal to zero, and 
solve for D. We obtain 

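$$\frac{dC}{dD} = \frac{k^*}{2} - \frac{a^*Q^*}{D^2} = 0 \quad\text{or}\quad \frac{k^*}{2} = \frac{a^*Q^*}{D^2}.$$

Multiplying both sides by 2D²/k* we obtain

$$D^2 = \frac{2a^*Q^*}{k^*} \quad\text{or}\quad D = \sqrt{\frac{2a^*Q^*}{k^*}}.$$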
Similarly, inventory should be reduced if the carrying cost k* goes up
(because k* appears in the denominator of the fraction).⁴

More surprising, and perhaps more important, is that this formula 
indicates inventory should increase only in proportion to the square root of 
sales. In other words, if sales of some item double, inventory should not 
be doubled—it should be increased to much less than 200 per cent of its 
original amount. As was mentioned earlier in this chapter, many firms fix 
their inventory at some constant percentage of sales volume (a fixed 
number of weeks’ worth of sales are kept in inventory) so that if one item 
sells five times as much as another, they will tend to keep five times as 
large an inventory of the former, which, as our result shows, means that 
they are keeping too much of the former, too little of the latter, or both. 
In fact, substantial savings have often been achieved because the last 
equation and related results have led analysts to recognize that the stand- 
ard rule of thumb tends to yield excessive inventories of the popular, 
large-sales-volume items and insufficient inventories of the goods whose 
sales are relatively modest.⁵ Thus we see that even with our highly over-
simplified inventory model an optimality analysis can, if used with
sufficient caution, produce significant practical results. 
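
The square-root rule can be illustrated numerically. The following Python sketch evaluates the formula D = √(2a*Q*/k*) for the numbers of the earlier example and then shows how the optimal delivery size responds when sales are doubled or multiplied by five; the function name is our own, and the sketch rests on the same simplifying assumptions as the model itself.

```python
import math


def optimal_delivery(Q, k, a):
    """Cost-minimizing delivery size D = sqrt(2*a*Q/k) from the inventory model."""
    return math.sqrt(2 * a * Q / k)


# Numbers of the example: Q* = 100 (thousand), k* = 8, a* = 60.
D_opt = optimal_delivery(Q=100, k=8, a=60)
print("optimal D:", round(D_opt, 2))            # square root of 1500

# Inventory rises only with the square root of sales.
ratio_double = optimal_delivery(200, 8, 60) / D_opt
ratio_five = optimal_delivery(500, 8, 60) / D_opt
print("sales x2 -> inventory x", round(ratio_double, 3))
print("sales x5 -> inventory x", round(ratio_five, 3))
```

The computed optimum, roughly 38.7 thousand units, agrees with the value of about 40 suggested by the trial calculations, and a fivefold increase in sales calls for an inventory only a little more than twice as large.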
[...]
larger-volume goods an inventory which is not nearly as high in proportion to its sales 
may be expected to provide an adequate reserve against most unforeseen demands. 
Thus, here again, the optimal size of inventory will normally increase less than pro- 
portionately with sales volume. 


Some Elementary 
Mathematics 


2 


This chapter provides an explanation of a number of elementary 
but fundamental mathematical ideas. It discusses the meaning of and 
notation for a function of one and of many variables, the equation of a 
straight line and a few other simple relationships, the definition of "slope," 
the definitions and elementary rules of manipulation of exponents and,
as a matter of notation, the Σ (sigma) representation of a sum. All of these
concepts appear frequently in the literature of economics and occur later 
in the book. The reader who is not sure of himself on these topics would 
therefore do well to master this material before proceeding; he should not 
find it difficult. Readers who are familiar with these concepts clearly need 
not waste their time on this chapter. 


1. Functions 


The expression y = f(x), which is read “y is a function of x” (and does
not represent some number x multiplied by another number, f), means
that there is some, perhaps unspecified, relationship between the values of 


* The material in this chapter is far more rudimentary than the contents of the chap-
ters which follow. However, many readers may have forgotten the logic behind such
fundamental concepts as linear equations, negative exponents, etc. This material is 
therefore presented for review to facilitate the reading of some subsequent portions of 


this book. 
the variables y and x. That is, for some values of x, such a relationship
specifies corresponding values for y. Such a functional relationship is
summarized in the following table: 


This states that when x = 1 then y = f(1) = 15. Similarly, the value of
y which corresponds to x = 5, i.e., f(5), is equal to -22, etc. Basically
then, y = f(x) is a symbol which represents such a table of values.
Sometimes it may represent a specific algebraic relationship such as
y = 15x² log x + 3 from which we can compute the corresponding table
of values, but this will not always be the case. 

There are many economic examples of such functional relationships. 
Thus, some simple models contain a demand function of the form Q_d = f(P)
which states that Q_d, the quantity demanded of some commodity, depends
on P, the price of the item. If it is desired to introduce a second functional
relationship (e.g., a supply function) which is to be distinguished from a
functional relationship that was previously introduced, then other symbols
such as F, g, φ (the Greek letter phi), or f_1 may be used instead of f. Thus,
the supply function might be written Q_s = g(P), where Q_s represents the
quantity of the commodity which is supplied. 

The quantity of the commodity which is demanded may, and in fact 
does, also depend on the values of variables other than price. For example, 
it may depend on the level of consumer income, Y, and on the volume of 
advertising expenditure, A. This is a multivariable demand function which 
is written symbolically as Q = f(P, Y, A). A somewhat more general
notation is Q = f(x_1, x_2, ..., x_15) which states that the value of Q is
dependent on the values of 15 different variables. Here x_1 may represent price,
x_2 consumer income, etc. Still greater generality can be achieved by
representing the number of variables by the symbol n, in which case we write
Q = f(x_1, x_2, ..., x_n), meaning that the value of Q depends in some way
on the values of each of some unspecified number of variables.
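
The two senses of y = f(x) described above, a bare table of values and an explicit formula, can be sketched in Python. Only the two table entries quoted in the text are used; the demand formula, its linear form, and every coefficient in it are hypothetical and serve purely as an illustration of the notation Q = f(P, Y, A).

```python
# A function given only as a table of values (the two entries quoted in the text).
f_table = {1: 15, 5: -22}

def f(x):
    """Look up y = f(x); the relationship is defined only for tabulated x."""
    return f_table[x]

# A multivariable demand function Q = f(P, Y, A).
# The linear form and every coefficient below are hypothetical.
def quantity_demanded(price, income, advertising):
    return 100 - 4 * price + 0.5 * income + 2 * advertising

print(f(1), f(5))                     # 15 -22
print(quantity_demanded(10, 40, 5))   # 100 - 40 + 20 + 10 = 90.0
```

The table version makes no claim about values of x that are not listed, which is exactly the sense in which the relationship may be "unspecified."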


2. Slope 


The slope of a line is a measure of steepness. For this purpose the follow- 
ing convention is employed: One simply calculates how much the line rises 
per unit move to the right. That is, if moving four units to the right involves a
two-unit rise in the graph, we say that its slope is 2/4 = 0.5, i.e., that it
rises at an (average) rate of one-half unit as one moves one unit to the
right.

This is illustrated in Figure 1. As we move four units to the right below
straight line LL' from point A to point B, the curve climbs two units to
point C. Specifically, the slope of the curve is given by the increase in the
vertical coordinates of points A and C divided by the increase in their
horizontal coordinates. That is, it is equal to

$$\frac{BC}{AB} = \frac{2}{4} = \frac{1}{2}.$$

A line such as segment NN' which goes downhill as we move to the right is
said to have a negative slope. In this case, since its level diminishes two
units as we move one unit to the right, it is of slope -2.

The difference between a line which is straight and one which is not is that
the slope of a straight line never changes. The line LL' is equally steep in
the vicinity of point A and in that of point B. By contrast, at the bottom of
curve SS' the line is fairly flat, but it grows steeper as we move up along it
either to the right or the left, so that its slope increases in numerical value.


Figure 1 


3. Linear and Other Simple Equations 


The equation y = 6x + 3 and the more general equation y = ax + b
(where a and b are any numbers) are called linear. If the reader were to
use such an equation to compute a table of values for x and y, and then
plotted these figures on a graph, he would find that he had drawn a set of
points all of which lie along a straight line; y = ax + b is therefore called a
linear equation.


We can prove that this must be so without much difficulty. In the preceding
section it was indicated that a straight line may be defined as one whose
slope never changes. To prove that y = ax + b is represented by a straight
line, we must therefore show that the slope of the graph is a constant (a
fixed number such as a or b). Now consider any two points such as W and V
(Figure 2), with coordinates (x_0, y_0) and (x_1, y_1), which lie on the graph of
this equation, where the remainder of the graph of the equation is not shown
in order to avoid prejudging its shape. As we have seen, the slope of this
graph is (y_1 - y_0)/(x_1 - x_0). But, since both points lie on the graph of
y = ax + b, this equation tells us that there is a relationship between y_0
and x_0 and a similar relationship between y_1 and x_1: y_0 = ax_0 + b and
y_1 = ax_1 + b. Substituting these expressions for y_0 and y_1 into the
expression for the slope we obtain


$$\text{slope of } (y = ax + b) = \frac{y_1 - y_0}{x_1 - x_0} = \frac{(ax_1 + b) - (ax_0 + b)}{x_1 - x_0} = \frac{ax_1 - ax_0}{x_1 - x_0} = \frac{a(x_1 - x_0)}{x_1 - x_0} = a.$$


Thus we have proved that the slope of the graph of y = ax + b is always 
equal to the number a, the coefficient of the term ax. For example, the graph 
of y = 6x + 3 is of slope 6, which remains unchanged throughout the 
length of the graph. Hence that graph must be a straight line, as was to 
be proved.¹
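
The result can also be checked numerically. The sketch below computes the rise-over-run slope between arbitrary pairs of points, first on the line y = 6x + 3 and then, for contrast, on the nonlinear equation y = x^2 + 3. The function names and the particular pairs of x values are illustrative choices, not part of the text.

```python
def slope(f, x0, x1):
    """Rise over run between the points (x0, f(x0)) and (x1, f(x1))."""
    return (f(x1) - f(x0)) / (x1 - x0)

def line(x):      # y = 6x + 3, the linear example
    return 6 * x + 3

def curve(x):     # y = x^2 + 3, a nonlinear equation for contrast
    return x ** 2 + 3

pairs = [(0, 1), (2, 5), (-3, 4)]   # arbitrary pairs of x values
print([slope(line, a, b) for a, b in pairs])    # [6.0, 6.0, 6.0]
print([slope(curve, a, b) for a, b in pairs])   # [1.0, 7.0, 1.0]
```

Whichever pair of points is chosen, the line returns the coefficient a = 6, while the slope of the curve depends on where it is measured.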

The constant, b, in our linear equation can also be given a simple 
interpretation. Consider the point on the graph where x = 0, i.e., the 
point where the graph crosses the y axis. There we have 


$$y = f(0) = a \cdot 0 + b = b.$$

In other words, b is the y intercept: the value of y (the height of the graph)
at the point where it crosses the vertical axis.

It can be shown by similar arguments that the graph of a three-variable 
relationship such as y = 3x + 4z — 6 is the three-dimensional analogue 
of a straight line—a plane in a three-dimensional diagram whose three 
axes represent the values of x, y, and z. More generally, by analogy, we
use the term linear equation for any relationship such as

$$y = a_1 x_1 + a_2 x_2 + \cdots + a_n x_n + b,$$

where y, x_1, x_2, ..., x_n are all variables and a_1, a_2, ..., a_n and b are all
constants.
The second-degree nonlinear equation y = x^2 + 3 has the graph SS'
in Figure 1, as may be verified by plotting some points. Similarly, other
types of equation have characteristic graphic forms, some of which are
indicated in Figure 3. Note that the number of "bumps," B, in an
nth-degree polynomial equation

$$y = a_1 x^n + a_2 x^{n-1} + \cdots + a_n x + b$$

is (usually) one less than the degree, n, of the equation (though there are
exceptional equations for which this is not true). Thus, the third-degree
equation

$$y = ax^3 + bx^2 + cx + d$$

has one hill and one valley, the quartic has one peak and two valleys,
etc. The next graph, the exponential y = ka^x, exhibits an explosive, roughly
geometric (cumulatively increasing) growth rate. The inverse type of
relationship y = k/a^x + b can level off, with the fractional term gradually
approaching zero. The trigonometric functions, y = a sin x + b and
y = cos x + b, form a perfectly symmetrical and repetitious cyclical
pattern. Finally, the relationship y = a/x^n, of which the frequently
encountered rectangular hyperbola, y = a/x or xy = a, is a special case,
approaches the axes asymptotically; that is, it comes closer and closer to
the axes but never quite reaches them. Of course, many other types of
relationship exist and each has its characteristic graph.


Figure 3² [panels: cubic, y = ax^3 + bx^2 + cx + d; quartic, y = ax^4 + bx^3 + cx^2 + dx + e; exponential, y = ka^x (a > 1); logarithmic, log y = Ax + k; inverse exponential, y = k/a^x + b; trigonometric, y = a sin x + b; hyperbolic, y = a/x]


1 It is easily shown that the converse is also true, i.e., that any straight line will
match an equation of the form y = ax + b. For let a* be the slope of any such line and
let b* be its y intercept (see below). Then this line must be the graph of y = a*x + b*,

2 For reasons which will become clearer in Sections 4 and 5 below, exponential and
logarithmic equations are very closely related and, indeed, one can usually be translated
into the other. That is, of course, why they have similar graphs.


PROBLEMS 
Show that 3y + 6x + 9 = 0 is a linear equation. Prove that its slope is -2.
Generalize this result to explain why any equation of the form

$$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = k$$

is linear. Here the a's and k are any (unspecified) constants (numbers).


4. Exponents: Definitions and Elementary Rules of Manipulation


Negative, fractional, and zero exponents may seem puzzling at first, 
but these are all extensions of the standard definition 


$$x^n = x \cdot x \cdots x \quad (n \text{ times}),$$

i.e., x multiplied by itself n times. As a special case of this definition we
have x^1 = x.

Now suppose we multiply, say, x^2 by x^3. By this definition,³

$$x^2 \cdot x^3 = (x \cdot x) \cdot (x \cdot x \cdot x) = x \cdot x \cdot x \cdot x \cdot x = x^5.$$

Generalizing, we obtain the fundamental 


RULE 1. Multiplication: x^a · x^b = x^(a+b), that is, the product of two
identical terms, each raised to a different power, is equal to that same
term, this time raised to the sum of the two powers.

In other words, to multiply power terms we add their exponents. An
analogous rule applies to division. For example, to divide x^5 by x^2 write

$$\frac{x^5}{x^2} = \frac{x \cdot x \cdot x \cdot x \cdot x}{x \cdot x} = x \cdot x \cdot x = x^3.$$

Hence we conclude

RULE 2. Division: x^a/x^b = x^(a-b).

Thus, to divide exponent terms, subtract the exponent of the denominator
from that of the numerator.

3 To generalize this and the arguments that follow we need merely substitute letters
for the numbers in the illustrative "proofs." For example,

$$x^a \cdot x^b = \underbrace{(x \cdot x \cdots x)}_{a\ x\text{'s}} \, \underbrace{(x \cdot x \cdots x)}_{b\ x\text{'s}} = \underbrace{(x \cdot x \cdots x)}_{(a+b)\ x\text{'s}} = x^{a+b}.$$


This rule seems obvious so long as a is greater than b (a > b). But it is
also extended verbatim to cases where this does not hold, e.g., to division
of x^4 by x^6, which is then written x^(4-6) = x^(-2). Several significant
consequences follow. If we divide x^a by x^a, the result is clearly equal to unity.
But by Rule 2, x^a/x^a (= 1) = x^(a-a) = x^0. Hence,

RULE 3. Zero exponents: x^0 = 1, for any (nonzero) number x.


A consequence of Rule 2 and Rule 3 can be obtained by taking the
reciprocal of x^a. This is equal to 1/x^a, which by Rule 3 is equal to x^0/x^a,
or by Rule 2, it is equal to x^(0-a) = x^(-a). This, then, yields the definition of a
negative exponent, i.e.,


RULE 4. For any nonzero x and any a, the expression x^(-a) represents the
reciprocal of x^a.


To raise a term, x^a, to a higher power, e.g., to square x^3, we multiply
this term by itself to obtain

$$(x^3)^2 = (x \cdot x \cdot x)(x \cdot x \cdot x) = x^6 = x^{3 \cdot 2}.$$

More generally,


RULE 5. Powers: x^a raised to the bth power is equal to x^(ab), so that, to
raise such a term to the power b, one multiplies the old exponent by b.

Similarly, to undo this operation, e.g., to take the square root of x^6, we get

$$\sqrt{x^6} = x^3 = x^{6/2}.$$

More generally,


RULE 6. Roots: The bth root of x^a is x^(a/b). In particular, x^(1/b) is the bth
root of x (= x^1).
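
As a quick numerical check of Rules 1 through 6, the sketch below evaluates both sides of each rule for one arbitrary choice of x, a, and b. The particular values are illustrative only, and the comparison uses a floating-point tolerance rather than exact equality.

```python
import math

x, a, b = 2.0, 5, 3   # arbitrary illustrative values; x must be nonzero

checks = {
    "Rule 1: x^a * x^b = x^(a+b)":       math.isclose(x**a * x**b, x**(a + b)),
    "Rule 2: x^a / x^b = x^(a-b)":       math.isclose(x**a / x**b, x**(a - b)),
    "Rule 3: x^0 = 1":                   x**0 == 1,
    "Rule 4: x^(-a) = 1 / x^a":          math.isclose(x**(-a), 1 / x**a),
    "Rule 5: (x^a)^b = x^(ab)":          math.isclose((x**a)**b, x**(a * b)),
    "Rule 6: bth root of x^a = x^(a/b)": math.isclose((x**a)**(1 / b), x**(a / b)),
}
for rule, holds in checks.items():
    print(rule, holds)
```

A check of this kind confirms the rules for particular numbers only; the general argument is the one given in the text.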


5. Logarithms 


The ways of manipulating exponents just described are widely used to 
simplify computational problems. The facts that multiplication can be 
reduced to addition of exponents and that a number can be raised to a
higher power by multiplication provide the basis for logarithmic computa- 
tion. This section is not intended to teach the reader how to use logarithms 
in computation; a much longer and more detailed exposition is required 
for this purpose. Rather, it seeks to illustrate how the results of the
previous section can be employed and to review the simple basis of the
elementary theory of logarithms, which is often forgotten by many who use
the device.

Given any positive number k, we define the logarithm of k (to the base 10) to
be a number which satisfies the following relationship: k = 10^(log k). That is,
log k is a number such that, if one raises 10 to the power log k, then the
result is equal to k.

Since these so-called common logarithms are all powers of 10, they
can be combined according to the rules for the manipulation of exponents.
For example, to multiply two numbers, a and b, observe that (by Rule 1
of the previous section)

$$a \cdot b = 10^{\log a} \cdot 10^{\log b} = 10^{\log a + \log b}.$$

Hence we conclude

RULE 7. Logarithmic multiplication: To multiply two numbers, a and
b, use a table of logarithms to find the number whose logarithm is equal to
log a + log b.

Similarly, to raise any number k to the power a we note that (by Rule 5) 


$$k^a = (10^{\log k})^a = 10^{a \log k}.$$

Therefore


RULE 8. Logarithmic calculation of powers: To raise any number k to
any power a, find the number whose logarithm is equal to a log k.


To illustrate the economy which these rules permit, the reader may 
wish to consider what sort of labor would be involved, without the use of 
logarithms, in calculating, e.g., 100(1.05)^25, the value after twenty-five
years of a $100 security which carries an interest of 5 per cent, compounded 
annually, and where the interest is allowed to accumulate. 
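
The compound-interest calculation can be carried out both ways in a few lines. The sketch below applies Rule 8 (and Rule 7 for the factor of 100) using base-10 logarithms from Python's standard math module in place of a printed table, and compares the result with direct exponentiation. The variable names are illustrative.

```python
import math

principal, rate, years = 100, 0.05, 25

# Rule 8: the logarithm of (1.05)^25 is 25 * log(1.05);
# Rule 7: multiplying by 100 adds log(100).
log_value = math.log10(principal) + years * math.log10(1 + rate)
value_via_logs = 10 ** log_value            # the "antilogarithm"

value_direct = principal * (1 + rate) ** years

print(round(value_via_logs, 2), round(value_direct, 2))   # 338.64 338.64
```

The logarithmic route replaces twenty-five successive multiplications with one multiplication and one addition, which is the economy the text refers to.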


6. Σ Notation


A final item in this collection of miscellaneous mathematical back- 
ground material is a standard notation for addition, employed at a number 
of points in this volume. Let x_1, x_2, x_3, and x_4 be symbols used to represent
four numbers, say x_1 = 5, x_2 = 0, x_3 = -2, and x_4 = 12. The Greek letter
Σ (upper-case sigma) is used to indicate summation. The addition of
these four numbers is then written as

$$\sum_{i=1}^{4} x_i.$$

This means: Add the numbers x_i which are obtained, successively, by
letting i be equal to 1, then letting i be equal to 2, all the way up to i = 4.
This is the significance of the notation above and below the Σ. The
number (called a summation index) which is below the Σ indicates the
first term to be included in the sum, and the number above the Σ
represents the last term in the sum. Several examples should make this clear:

$$\sum_{i=1}^{4} x_i = x_1 + x_2 + x_3 + x_4 = 5 + 0 - 2 + 12 = 15,$$

$$\sum_{i=1}^{3} x_i = x_1 + x_2 + x_3 = 5 + 0 - 2 = 3,$$

and

$$\sum_{i=2}^{4} x_i = x_2 + x_3 + x_4 = 0 - 2 + 12 = 10.$$

A slightly more subtle example is the Σ representation of the power series⁴

$$1 + y + y^2 + y^3 + y^4 + y^5 + y^6, \quad \text{which is just} \quad \sum_{i=0}^{6} y^i.$$



Where a large number of terms is involved, this notation saves both 
space and time. 
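
The summation notation translates directly into a loop over the index. In the sketch below, the helper name `sigma` and the value chosen for y are illustrative assumptions; the four x values are those used in the examples of this section.

```python
x = {1: 5, 2: 0, 3: -2, 4: 12}   # x_1, ..., x_4 as used in this section

def sigma(term, first, last):
    """Sum term(i) for i = first, first + 1, ..., last (both ends included)."""
    return sum(term(i) for i in range(first, last + 1))

print(sigma(lambda i: x[i], 1, 4))   # 15
print(sigma(lambda i: x[i], 1, 3))   # 3
print(sigma(lambda i: x[i], 2, 4))   # 10

y = 2                                 # arbitrary value for the power series
print(sigma(lambda i: y ** i, 0, 6))  # 1 + 2 + 4 + 8 + 16 + 32 + 64 = 127
```

The number below the Σ corresponds to `first` and the number above it to `last`; the term for i = 0 in the power series is y^0 = 1.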
PROBLEMS 


1. Write out in Σ notation
(a) the linear equation

$$y = a_1 x_1 + a_2 x_2 + a_3 x_3 + a_4 x_4$$

(b) the polynomial equation

$$y = a_0 + a_1 x + a_2 x^2.$$

2. Write out term by term 


(3) Zac 


3 
(b) 2 2. 


4 The reader will recall from Rule 3 of Section 4, above, that any number raised to 
the zeroth power is equal to 1. Specifically, y^i = 1 when i = 0.
1. Marginal Reasoning and the Logic of Decision-Making


There is a common element to all decision problems which is expressible
in the apparently trivial question, “Is it worthwhile?” A firm considering 
an improvement in product quality, a consumer considering the purchase 
of a bottle of wine, or a government agency considering the organization 
of another research project must all ask the same question—whether the 
action in question will add sufficiently to the benefits enjoyed by the per- 
former to make it worth the cost. This is the heart of marginal decision- 
making—the statement that an action merits performance if and only if, 
as a result, the actor can expect to be better off than he was before. 

Although this proposition seems obvious enough, the fact that it is 
frequently violated in practice suggests that it requires some examination. 
First let us see how a decision which runs counter to this rule is likely to 
arise. Consider the following example: A manager is empowered to hire 
an additional salesman. He decides to send this man to St. Louis rather
than to Cleveland because last year’s orders per salesman were $60,000 in 
St. Louis and $43,000 in Cleveland. But it is possible that the difference 
in returns per salesman in the two cities occurred just because the size of 
the sales force in the former was well adapted to the number of retailers 
whereas the sales force in the latter was spread too thinly. If so, the new 
salesman may add little, if anything, to the company’s orders in the sales- 
man-saturated St. Louis market, but in Cleveland he might produce a 
substantial increase in sales. Clearly, if the firm's objective is to maximize
its orders, it would in this case be better to send the man to Cleveland. 

The figure giving the size of orders per salesman is referred to as the 
average return per salesman, whereas the increase in sales which results 
from the presence of an additional salesman is called the marginal return. 
The manager in the illustration was (inadvertently) acting contrary to the 
firm’s interests by sending a salesman to the city where the average return 
per salesman was higher rather than to the area where his marginal return 
(the amount he could add to company sales) was higher. A basic theorem 
of this section, then, is that the best interests of a firm, a consumer, or any 
other economic unit require that any decision take into account the magnitude
of the marginal yield which it promises. 

As will be shown in the next chapter, the marginal analysis is very 
closely related to the classical mathematical tool of optimality analysis— 
the differential calculus. A marginal datum may be interpreted, roughly, 
as a first derivative, and most of the findings of this chapter can readily be
translated into calculus terms as is shown occasionally in footnotes. 

Marginal analysis, because it can be explained with the aid of arith- 
metic examples, has a distinct expository advantage. But, on the other 
hand, in this form it is a relatively blunt calculating instrument to which 
many of the powerful analytic theorems of the differential calculus de- 
scribed in the next chapter do not apply. 


2. Theorems on Resource Allocation 


To emphasize further its importance for decision-making, let us now 
examine two fundamental propositions of the marginal analysis. The first 
of these is designed to determine the magnitudes of the variables which 
constitute an optimal decision: How much should be spent on newspaper 
advertising? How far should price be cut? How many pounds of plums 
should a consumer buy? The paradoxical answer to a question of this sort is 


RULE 1. Optimal activity level: The scale of an activity should if possible
be expanded so long as its marginal net yield (taking into account both
benefits and costs) is a positive value, and the activity should, therefore,
be carried to a point where this marginal net yield is zero.¹

This result is paradoxical because it suggests the question, "Why
shouldn't we quit while we're still ahead?" Why not stop advertising when
the marginal return on a dollar of advertising is, say, $1.75? The answer,
here, is that a firm which makes such a decision is voluntarily missing the
opportunity to make even more money. An additional dollar spent on
advertising will leave the firm 75 cents (= $1.75 - $1.00) ahead, and
failure to take advantage of the opportunity therefore leaves the firm 75
cents poorer.


1 This is no more than the standard calculus proposition that to maximize any
y = f(x) it is necessary that the first derivative, dy/dx (the marginal effect of x on y)
be zero. It is, however, very important to realize that this is a necessary but not a
sufficient optimality condition, i.e., all optima to which it is relevant must satisfy Rule 1,
but not all situations which meet the requirements of the rule will be optima. To
separate the sheep from the goats we must satisfy in addition the very important
second-order conditions described in Section 9 of this chapter. See also Section 2 of Chapter 15
for an illustrative application of this point.

Of course, the firm may not have any more funds to lay out on ad- 
vertising. In that case the rule cannot be followed. That is the significance 
of the proviso in Rule 1 that the marginal yield of any activity should be 
reduced to zero whenever possible. As we shall see in later chapters, this 
remark lies behind the role of mathematical programming. 

Another possible objection to Rule 1 is that the firm may have a better 
use for its money than expansion of the scale of the activity that happens 
to be under discussion. A dollar spent on improved quality control may 
perhaps yield $1.97, as against the $1.75 return to an additional advertising 
dollar; therefore advertising should not be increased until something is 
done to take advantage of the quality-control profit opportunity. By itself 
this argument leads only to the conclusion that more should ultimately be 
spent on both activities. If there is enough money available, and both of 
these types of expenditure yield diminishing marginal returns, an optimal 
budget will provide enough funds to expand both activities until each of 
them has a zero marginal yield. But where funds are limited so that this 
ideal cannot be attained, we have a second fundamental rule of marginal 
analysis: 

RULE 2. Relative activity levels: For optimal results activities should,
wherever possible, be carried to levels where they all yield the same mar- 
ginal returns per unit of effort (cost). 

For, where this condition does not hold, the decision-maker is missing 
an opportunity to benefit by reallocating some of his resources from an 
activity with the smaller marginal return to one with a larger marginal 
yield. In the previous example, he can benefit by budgeting more money 
for quality control rather than for advertising. The transfer of one dollar 
from advertising (marginal yield $1.75) to quality control (marginal yield 
$1.97) must yield a clear gain of $1.97 — $1.75 = 22 cents, in effect giving 
him something for nothing! Hence, generally, unless Rule 2 is satisfied, he 
cannot possibly be getting the maximum yield from his resources; his 
resource allocation among alternative activities can only be optimal if all 
his expenditures yield the same marginal return. 
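
The reallocation argument can be sketched as a simple procedure: spend each successive dollar wherever the next dollar yields the most. In the example below only the first-dollar yields ($1.75 and $1.97) come from the text; the rest of each schedule, the six-dollar budget, and the function name are hypothetical, and the schedules are assumed to show diminishing marginal returns, which is what makes this dollar-by-dollar procedure reach the best allocation.

```python
# Hypothetical schedules: yield of the 1st, 2nd, ... dollar spent on each
# activity. Only the first entries (1.75 and 1.97) come from the text.
marginal_yield = {
    "advertising":     [1.75, 1.60, 1.40, 1.20, 1.00],
    "quality control": [1.97, 1.80, 1.50, 1.10, 0.90],
}

def allocate(budget):
    """Spend one dollar at a time where the next dollar yields the most."""
    spent = {name: 0 for name in marginal_yield}
    total = 0.0
    for _ in range(budget):
        # Activities whose schedule still has an unspent dollar.
        available = [n for n in spent if spent[n] < len(marginal_yield[n])]
        if not available:
            break
        best = max(available, key=lambda n: marginal_yield[n][spent[n]])
        total += marginal_yield[best][spent[best]]
        spent[best] += 1
    return spent, total

spent, total = allocate(6)
print(spent, round(total, 2))   # {'advertising': 3, 'quality control': 3} 10.02
```

With whole-dollar steps the marginal returns of the last dollars spent (1.40 and 1.50 here) are only approximately equal; no transfer of a dollar between the two activities would raise the total, which is the content of Rule 2.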


3. Totals, Averages, and Marginals: Their Arithmetic Relationships


It is customary to explain the arithmetic of marginal analysis with the 
aid of a table such as Table 1. 
TABLE 1


No. of Units    Total x    Average x    Marginal x

     0              0          -             0
     1             80         80            80
     2            180         90           100
     3            270         90            90
     4            280         70            10
     5            250         50           -30


The last three columns have been labeled “Total,” “Average,” and 
“Marginal x” to indicate that these are purely arithmetic relationships 
which are valid whether we are interested in total, average, or marginal 
revenue, cost, profit, or utility. That is, the reader can substitute any of 
these words for the letter x and end up with a valid table.

One may, perhaps, get a better grasp of the relationship if the numbers 
are, for the moment, interpreted as the examination grades of a group of 
five men. The first row indicates that before any of the papers has been 
marked the total grade is zero. The marginal grade is then, by convention, 
also considered zero (nothing has yet been added to the total), but there 
is no meaning to the concept of “average grade” when there are no papers. 
The first paper, however, gets an 80 (marginal grade = addition to total 
grade = 80) so that the total of all grades so far is 80, and the average 
grade is also 80. The second paper gets 100 (marginal grade 100) which 
brings the total up to 180 and pulls the average up to (80 + 100) /2 = 90, 
and so on.² We see, then, that the marginal figure is defined as the amount
which is added to the total by each additional paper. The average (arithmetic 
mean), of course, has the usual connotation: the total divided by the 
number of units. 

Three relationships emerge from this illustration, each of which will 
play some role in the subsequent discussion: 


RULE 3. First units: The total, average, and marginal figures for the
first unit are identical (so long as total x for the zeroth unit is zero).

For example, the total, average, and marginal grades of the first paper 
in Table 1 are all 80. Similarly, if a farmer sells one cow for $50, his mar- 
ginal revenue (the addition to his income produced by selling one cow) is 
$50, his revenue per cow sold (average revenue) is $50, and his total revenue 
is also $50. An exception to this apparently inviolable rule will emerge in
Section 6 of this chapter. 


2 The fifth man must have done something unusual, since his paper receives a grade
of minus 30! 
RULE 4. Relation between total and marginal figures: The total figure is
always the sum of the preceding marginal figures. 


For example, the total grade of the first three papers, 270, is the sum 
of the grades of these papers, 80 + 100 + 90. This rule follows from the 
definition of the marginal figure as the addition to the preceding total 
figure (the current total is the sum of the preceding increments to it). 
Finally, we have the somewhat more complicated 


RULE 5. Relation between average and marginal figures: For the average
to rise, the marginal figure must be above the average figure; for the
average to remain unchanged, the average and marginal figures must be equal;
for the average to fall, the marginal figure must be below average.³


For example, in Table 1, when the average figure rises from 80 to 90, 
the marginal figure, 100, lies above the corresponding average figure, 90. 
The reason is easily seen—the average of a group of grades can only be 
raised by adding a paper whose grade is above average. 

For similar reasons, the table shows that when the average remains 
unchanged at 90, the latest grade (the marginal figure) must itself be just 
average, i.e., 90. And when the average grade then falls to 70 it must have 
been pulled down by a below-average grade, i.e., 10. 

This last rule is often misunderstood to state that, when the average 
figure falls, the marginal must also fall; when the average remains un- 
changed, the marginal must remain unchanged, etc. Such a conclusion is 
false. For example, while the average figure remains unchanged at 90 
when the third paper is graded, the marginal figure falls from 100 to 90. 
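
The arithmetic of Table 1 can be reproduced from the total column alone. The sketch below derives the marginal and average figures for units 1 through 5 and then checks Rules 4 and 5 against them; the variable names are illustrative, and the zeroth row (where the average is undefined) is left out of the derived lists.

```python
totals = [0, 80, 180, 270, 280, 250]   # "Total x" column of Table 1

# Marginal figure: the addition to the total made by each successive unit.
marginals = [totals[n] - totals[n - 1] for n in range(1, len(totals))]
# Average figure: total divided by number of units (undefined for zero units).
averages = [totals[n] / n for n in range(1, len(totals))]

print(marginals)   # [80, 100, 90, 10, -30]
print(averages)    # [80.0, 90.0, 90.0, 70.0, 50.0]

# Rule 4: each total is the sum of the marginal figures up to that unit.
assert all(sum(marginals[:n]) == totals[n] for n in range(1, len(totals)))

# Rule 5: the average rises, stays put, or falls according to whether the
# marginal figure is above, equal to, or below the average.
for n in range(1, len(averages)):
    change = averages[n] - averages[n - 1]
    gap = marginals[n] - averages[n]
    assert (change > 0) == (gap > 0) and (change < 0) == (gap < 0)
```

Note that the check for the third unit passes with the average unchanged at 90 even though the marginal figure has fallen from 100 to 90, which is the distinction drawn in the preceding paragraph.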


4. Geometry of Marginal Analysis: Total x Curves 


The preceding rules help us to find geometric relationships among the
average, marginal, and total concepts. First, let us draw a graph describing
the total profitability of advertising expenditure (Figure 1) and show how
the corresponding marginal and average figures can be read off from it.
This total profit curve, as its name implies, indicates how total profits will
vary with the firm's advertising expenditure. It plots data like those in
column 2 of Table 1 against the information in column 1. For example,
point A in Figure 1 indicates that $3 million spent on advertising will
yield a total of $4 million in profit to the firm. The average profit (profit
per advertising dollar) is then $4 million/$3 million = $1.33, approximately.
But, in this ratio the $4 million is represented by line segment RA,
while the $3 million is depicted by OR. Hence, the average profitability of
expenditure OR is RA/OR, which is the slope of OA, the straight line from
the origin to point A. In other words, we have


Figure 1 [vertical axis: total profit (millions)]


3 A general proof of this result can also be obtained with the aid of a little differential
calculus:

Let Q represent quantity, or number of units (the item in the first column of Table 1)
and let A be an average figure. Then the corresponding total figure is AQ (e.g., total
revenue equals average revenue multiplied by the number of units sold). The marginal
figure, M, is the derivative of the total figure with respect to the independent variable
Q, i.e.,

$$(1) \qquad M = \frac{d(AQ)}{dQ} = A + Q\,\frac{dA}{dQ}.$$

From this equation Rule 5 follows readily. For example, when the average figure is
rising, we have dA/dQ > 0, and it follows that M > A, etc.


RULE 6. Average x and total curves: Given any point A on a curve
representing total x, the corresponding average x figure at that point is the
slope of the straight line, OA, which connects point A with the origin.


This rule permits us to conclude by inspection that between points A 
and D the average profitability of advertising has risen, for the slope of 
line OD is clearly greater than that of OA. In fact, the point of maximum 
average profitability is D, the point of tangency between the total profit 
curve, OP, and the straight line OD, to the origin. The reader may readily 
check that at points to the right of D, such as M, the average profit will 
be lower than it is at D because the corresponding line segment through 
the origin, OM, is less steep than OD. 

The marginal productivity of advertising at point A is also measured 
by a slope, but this time by the slope of the total profit curve itself. For 
marginal profitability is defined as the increment in profit per unit addition
to advertising expenditure. But if advertising expenditure goes up from
OR to OS, i.e., by amount AC, equal to, say, $500,000, the curve shows that
total profit will rise by amount CE, equal to, say, $1,800,000. Hence, marginal
profitability will equal

$$\frac{1{,}800{,}000}{500{,}000} = \frac{CE}{AC} = \text{the slope of the total profit curve at } A.$$



Thus, the general 

RULE 7. Marginal x and total curves: Given any point A on a curve
representing total x, then marginal x is equal to the slope of the curve at
point A.


As an exercise in the use of Rules 6 and 7, we may note that at point 
A the total profit curve slopes more steeply than does straight line OA. 
This means that marginal profits at A must exceed average profits so that, 
by Rule 5, above, average profits must be rising. We have already seen 
that this is in fact the case since the slope of OD exceeds that of OA. 
Observe also that at point D, just at the point where average profit stops 
rising, OD has the same slope as the total profit curve, so that average 
and marginal profits are equal, again as Rule 5 would lead us to expect. 

The graphic interpretation of marginal x as the slope of the total x 
curve (Rule 7) also casts some light on the fundamental Rule 1 of this 
chapter, that the optimal level of any activity requires its marginal yield 
to be zero. Thus, in Figure 1 we note that at advertising expenditure $3 
million, the profit curve has a positive slope (a positive marginal profit
yield of advertising). At point B the curve is going downhill (negative 
marginal profit). Neither of these can be a (profit) optimum advertising 
expenditure for at A profits can be increased by advertising more, whereas 
from B the firm can raise its profits by advertising less. Only at M, where
the profit curve is neither rising nor falling, so that the slope of the profit 
curve is zero, can the advertising expenditure level have reached its opti- 
mum value. But that is precisely what Rule 1 asserts for this case—an 
optimal advertising outlay for a profit-maximizing firm requires that the 
marginal profit yield of advertising (the slope of the total profit curve)
be zero.
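
Rules 5, 6, 7, and 1 can be seen working together on a numerical example. The total curve used below, T(q) = 6q^2 - q^3, is purely hypothetical (it is not the curve of Figure 1); the average is computed as the slope of the ray from the origin, the marginal as the slope of the curve itself (approximated by a central difference), and the peaks are located by a simple grid search.

```python
def total(q):      # hypothetical total curve, T(q) = 6q^2 - q^3
    return 6 * q**2 - q**3

def average(q):    # Rule 6: slope of the ray from the origin, T(q) / q
    return total(q) / q

def marginal(q, h=1e-6):   # Rule 7: slope of the total curve (central difference)
    return (total(q + h) - total(q - h)) / (2 * h)

grid = [i / 100 for i in range(1, 601)]   # q = 0.01, 0.02, ..., 6.00
q_avg = max(grid, key=average)            # where the average peaks
q_tot = max(grid, key=total)              # where the total peaks

# At the peak of the average, marginal and average coincide (Rule 5).
print(q_avg, average(q_avg), marginal(q_avg))
# At the peak of the total, the marginal figure is zero (Rule 1).
print(q_tot, total(q_tot), marginal(q_tot))
```

For this curve the average peaks at q = 3 and the total at q = 4, so the point of maximum average lies to the left of the optimum, just as point D lies to the left of point M in the discussion above.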


5. Marginal and Average x Curves 


So much for the "total x curves." We can also draw both "average x"
and "marginal x" curves in a similar way (Figure 2). Their general
relationship is largely determined by Rule 5 and Rule 3. Rule 3 tells us that
the two curves must start out together (point C).⁴

To the right of point C the relative height of the two curves is governed
by Rule 5. Thus, it will be observed that output OQ_2 is the point of
minimum average costs Q_2M, so that there the average cost curve is horizontal
(average neither rising nor falling). At that output marginal and average
costs are equal (intersection point M). To the right of this point, where
average costs are rising, marginal costs lie above them. To the left of Q_2,
where the average cost curve has a negative slope, the marginal cost curve
lies below the average cost curve.⁵


Figure 2 [marginal cost and average cost curves plotted against output]

Just as marginal and average x can be shown on a total x diagram, it is
possible to represent total cost in terms of Figure 2. This can be done in 
two different ways—one in terms of the average cost curve and one in 
terms of the marginal cost curve. 

(a) Average curve representation of total x. Since total cost equals 
average cost (cost per unit) multiplied by the number of units produced, 
the total cost of output OQ₃ is OQ₃ × Q₃R, which is the area of the 
rectangle OQ₃RS. In other words, 


Rule 8. Total x and average curves: The total cost of any output OQ is 
the area of the rectangle inscribed under the average cost curve which has 
OQ as its base. 


4 There is a minor difficulty here. Average cost is not defined for zero units and is 
only equal to marginal cost when output = 1 unit (compare Table 1). Hence they should 
meet somewhere to the right of the vertical axis. However, it is usually implicitly 
assumed, as is done here, that the unit of measurement is very small, so that our first 
unit lies at a microscopic distance from the origin. The two curves can then be taken to 
start out from a point practically on the vertical axis. 

5 But it will be noted that in the interval Q₁Q₂ marginal cost is rising even though 
average cost is falling. 

(b) Marginal curve representation of total x. Alternatively, we can find 
another area which represents total cost, this time with the aid of the 
marginal cost curve. By Rule 4, total cost is the sum of the preceding 
marginal costs. But the marginal cost of the first unit is the area of the 
thin, shaded rectangle next to the vertical axis, since its height is the height 
of the marginal cost curve at that point. Similarly, the marginal cost of 
the second unit produced is represented by the next rectangle, etc. 
Therefore the total cost of producing OQ₃ units equals the sum of the preceding 
marginal costs equals the sum of the areas of all the thin rectangles equals 
area OQ₃TC. More generally, 


Rule 9. Total x and the marginal curve: The total cost of producing OQ 
units of any commodity is that portion of the area under the marginal cost 
curve which lies above line OQ. 


[Figure 3] 
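
Rules 8 and 9 can be checked with a few lines of arithmetic. The following is only a sketch: the cost schedule is a hypothetical illustration (it is not taken from Table 1), and it assumes that the total cost of zero units is zero, as for a purely variable cost.

```python
# Hypothetical total cost schedule (dollars) for 0..5 units.
# These figures are illustrative assumptions, not data from the text.
total_cost = [0, 100, 180, 270, 380, 520]


def marginal_costs(totals):
    """Marginal cost of unit q = rise in total cost from q - 1 to q units."""
    return [totals[q] - totals[q - 1] for q in range(1, len(totals))]


def average_cost(totals, q):
    """Average cost at output q (defined only for q >= 1)."""
    return totals[q] / q


q = 4
mc = marginal_costs(total_cost)

# Rule 8: total cost = average cost x number of units
# (the rectangle inscribed under the average curve).
area_rectangle = average_cost(total_cost, q) * q

# Rule 9: total cost = sum of the marginal costs of units 1..q
# (the thin rectangles under the marginal curve).
area_under_marginal = sum(mc[:q])

print(area_rectangle, area_under_marginal, total_cost[q])
```

Both areas reproduce the same total, which is all that the two rules assert.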


We can use these two representations of total x to derive a construction 
for the marginal curve from any average curve which is given to us. First, 
we assume that both curves are straight lines (Figure 3). It will now be 
proved that, given the average revenue curve CA, we obtain the marginal 
revenue curve by drawing in any horizontal line SR which ends on this 
average revenue curve, finding the midpoint W of SR, and drawing in the 
straight line CM which (in accord with Rule 3) begins at the same point 
as the average revenue curve and goes through this midpoint W. To prove 
the theorem the reader will have to go through the painful process of re- 
calling a bit of high school geometry. 

Step 1: The total revenue from any output OQ is given either by the 
area of rectangle OQRS inscribed under the average revenue curve or by 
area OQTC under the marginal revenue curve. Hence, if the two curves 
are drawn correctly, the two areas must be equal. 

Step 2: Since these two areas have shaded area OQTWS in common, 
it follows that right triangles SWC and WTR must be equal in area. 

Step 3: Moreover, angle SWC must be the same as TWR (opposite 
angles of a vertex are necessarily equal). Therefore triangles SWC and 
WTR must be similar (two angles in common) as well as being equal in 
area. 

Step 4: These triangles must, as a result, be congruent so that we must 
have SW = WR, as was to be proved. In sum, 


Rule 10. Straight-line marginal and average curves: Given any 
straight-line average curve, to find the corresponding marginal curve, draw a 
straight line which begins where the average curve cuts the vertical axis, 
and goes through the midpoint of any horizontal line to the average curve. 
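
Rule 10 can also be verified numerically. The sketch below assumes a straight-line average revenue curve A(Q) = a - bQ with hypothetical values of a and b; the function names are illustrative only. It draws a horizontal line at an arbitrary height, finds its midpoint, and checks that the marginal revenue at that output equals the height of the line.

```python
def average_revenue(q, a, b):
    """Straight-line average revenue curve A(Q) = a - b*Q."""
    return a - b * q


def marginal_revenue(q, a, b, dq=1e-6):
    """Numerical slope of total revenue R(Q) = Q * A(Q)."""
    def total(x):
        return x * average_revenue(x, a, b)
    return (total(q + dq) - total(q - dq)) / (2 * dq)


a, b = 100.0, 2.0      # assumed intercept and slope (hypothetical)
height = 60.0          # height of the horizontal line SR

q_r = (a - height) / b   # output at R, where SR meets the average curve
q_w = q_r / 2            # midpoint W of SR (S lies on the vertical axis)

mr_at_midpoint = marginal_revenue(q_w, a, b)
print(q_r, q_w, mr_at_midpoint)
```

The marginal curve passes through W at the height of SR, which is what the geometric proof establishes.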


[Figure 4: curved average revenue curve VA*, its tangent lines, and the 
derived marginal revenue curve; vertical axis: revenue] 

Where the average curve is not a straight line the construction is a bit 
more complex. We operate, in effect, by approximating the curve by a 
number of straight-line segments and using the preceding construction on 
these segments one at a time. For example, given the curved average 
revenue curve VA* (Figure 4), to find the marginal revenue at output 
OQ, draw the straight-line tangent CA to the average revenue curve at 
that output, and construct the corresponding straight-line marginal curve 
(call it the marginal line) CM, by the method just indicated (Rule 10). 
The point T, on CM directly below R, is then also on the marginal revenue 
curve to our original average curve VA*, at that output. Similarly, we find 
the marginal revenue at output OQ' by drawing the tangent line C'A' to 
average curve VA* through point R' (output OQ'). Now treat that straight 
line as an average revenue curve and find the corresponding marginal line 
C'M', and the point T' on that line which corresponds to output OQ' is the 
marginal revenue for this output. In this way we can obtain as many 
marginal points T, T', ... as we wish, and the locus of these points VM* 
is the marginal revenue curve which we seek. In sum, 


Rule 11. General marginal and average curves: To find the marginal 
curve to an average curve which is not straight draw a straight line tangent 
to the average curve at some output Q, find the corresponding marginal 
line to this tangent by Rule 10, and thereby find the marginal point on 
that line at output Q. The marginal curve which we seek is the locus of 
all such marginal points. 
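
The tangent construction of Rule 11 can be sketched numerically. The curved average revenue function below is a hypothetical example chosen for easy arithmetic, and the function names are illustrative. Since Rule 10 makes the marginal line reach any given height at half the output of the average line, the marginal line has the same intercept as the tangent and twice its slope.

```python
def avg_rev(q):
    """Hypothetical curved average revenue curve (an assumption)."""
    return 120.0 - 4.0 * q - 0.5 * q ** 2


def avg_rev_slope(q):
    """Slope dA/dQ of the curve above."""
    return -4.0 - q


def marginal_by_tangent_construction(q0):
    """Rule 11: tangent at q0, then the Rule 10 marginal line, read at q0."""
    slope = avg_rev_slope(q0)
    intercept = avg_rev(q0) - slope * q0   # where the tangent cuts the axis
    # Rule 10: same intercept, twice the slope of the average line.
    return intercept + 2.0 * slope * q0


def marginal_direct(q):
    """Derivative of total revenue Q*A(Q) = 120Q - 4Q^2 - 0.5Q^3."""
    return 120.0 - 8.0 * q - 1.5 * q ** 2


for q0 in (1.0, 3.0, 6.0):
    print(q0, marginal_by_tangent_construction(q0), marginal_direct(q0))
```

At each output the constructed marginal point coincides with the slope of total revenue, in agreement with M = A + Q(dA/dQ).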


6. Marginal Analysis and Fixed Costs 


Fixed costs are defined as costs whose magnitude does not vary with 
the level of output, at least within some range. For example, the rent of a 
factory may be the same whether that factory is going full blast or running 
at half capacity (but once demand exceeds the factory’s capacity, rental 
becomes a variable cost if a second factory is put into operation). Sim- 
ilarly, the cost of an automobile license is the same whether the car is used 
to travel 5,000 or 30,000 miles in a year. The special features of a fixed 
cost are (1) that it all comes in one big lump once it is decided to enter 
on the operation, but (2) after it is incurred a further expansion in output 
makes no difference in its magnitude.⁷ 

These features are illustrated by the fixed cost figures in Table 2. This 
table is completely analogous to Table 1 (which should now be interpreted, 
for purposes of comparison, as a table of total, average, and marginal 
variable cost). 

TABLE 2 

                    Total          Average        Marginal 
No. of Units      Fixed Cost     Fixed Cost     Fixed Cost 

     0               2000            -             2000 
     1               2000           2000             0 
     2               2000           1000             0 
     3               2000           666⅔             0 
     4               2000            500             0 
     5               2000            400             0 


It will be observed that the total fixed cost figure ($2,000) remains the 
same throughout, no matter what the number of units, even if no units 
are produced or sold. Thus, once a train runs, it costs the same even if it 
goes empty, and (in the absence of bankruptcy) a contract to buy a fixed 
amount of raw material must be honored even if none of it is used. 


6 A short proof can be given with the help of Equation (1) of footnote 3, above. This 
equation states that 

(1)    M = A + Q(dA/dQ) 

where M is the marginal x, A is average x, and Q is quantity (number of units). Consider, 
now, the straight-line average curve CA and curved average curve VA* which are 
tangent at point R. At that point the outputs OQ are equal on both curves, their average 
revenues QR also coincide, and because they are tangent, their slopes dA/dQ must also 
be the same. Hence, by Equation (1), at that output the marginal values QT 
corresponding to the two curves must be the same: find one and you have the other. 

7 It is also shown in the chapter on integer programming that fixed costs can cause 
computational difficulties because of these characteristics. 

Skipping to the marginal fixed cost column, we see that it, too, follows a 
very simple pattern. The very decision to go into an operation saddles 
management with its fixed costs—even before anything is produced. So, 
by convention, the zeroth unit is taken to expend the entire $2,000 (the 
marginal cost of the zeroth unit is $2,000). Thereafter, production of any 
further units adds nothing to fixed costs (the marginal fixed cost of the 
first unit, and any unit thereafter, is zero).⁸ A curve of marginal fixed costs, 
therefore, must always have the same shape—it coincides with the axes of 
the diagram. For, to the right of the origin, marginal fixed cost is always 
zero so that there the marginal cost curve lies along the horizontal axis, 
while at the origin the marginal cost is not zero (it rises above the horizontal 
axis along the vertical axis). 

The pattern of the average fixed cost curve is hardly more complicated. 
As our table shows, and for obvious reasons, the larger the number of units 
produced, the smaller will be the average fixed cost. The fixed costs are 
thereby spread over more units. However, the decline in average fixed costs 
is gradual, and though the average figure becomes smaller and smaller as 
the number of units increases, it never reaches zero. No matter by how 
large a number we divide $2,000 (no matter how thin the overhead costs 
are spread), there will still remain some unit cost which is greater than zero.⁹ 


8 In calculus terms, if T is total cost and Q is the number of units, then marginal cost 
is dT/dQ which equals zero whenever T is a constant (as it is for a fixed cost). 

9 The average fixed cost curve satisfies a simple mathematical relationship. Let K be 
total fixed cost and x be the number of units produced. Then average fixed cost, y, is 
the total fixed cost divided by the number of units, i.e., we have the equation y = K/x. 
This is an equation which is rather familiar in the literature. Its graph, which is called 
a rectangular hyperbola, is doubtless one of the economists’ most popular curves, 
second only to the straight line. A rectangular hyperbola is depicted in Figure 3a of 
Chapter 9 and some of its properties are described in Section 4 of that chapter where it 
enters into the discussion of the theory of consumer demand. 
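
The pattern of Table 2 is simple enough to generate mechanically. The sketch below uses the $2,000 fixed cost of the table and the convention described in the text that the zeroth unit bears the whole fixed cost; the function name is an illustrative assumption.

```python
FIXED_COST = 2000  # total fixed cost from Table 2, in dollars


def fixed_cost_row(units):
    """Return (units, total, average, marginal) fixed cost for one row."""
    total = FIXED_COST
    # Average fixed cost is undefined at zero units; afterwards it is K/x.
    average = None if units == 0 else total / units
    # By convention the zeroth unit bears the entire fixed cost.
    marginal = FIXED_COST if units == 0 else 0
    return units, total, average, marginal


rows = [fixed_cost_row(n) for n in range(6)]
for units, total, average, marginal in rows:
    avg_text = "-" if average is None else f"{average:.2f}"
    print(f"{units:>5} {total:>6} {avg_text:>9} {marginal:>6}")
```

Extending the range shows the average figure shrinking steadily without ever reaching zero, as the rectangular hyperbola y = K/x requires.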

As a final observation on fixed costs, as illustrated by Table 2, notice 
that, for the first unit, average fixed cost is $2,000, whereas marginal fixed 
cost is, as usual, zero. We see, then, that fixed costs violate Rule 3 of 
Section 3 of this chapter, which states that for first units marginal x and 
average x are always equal. That rule holds only so long as the total cost 
(or total x) of zero units is zero.¹⁰ 

So much for the purely fixed costs. In practice, of course, costs are 
usually neither entirely fixed nor entirely variable. If we refer to such 
expenses as combined costs, we have, by definition, 


total combined cost = total variable cost + total fixed cost. 


It is then easy to show from this definition that we have completely 
analogous relationships for average and marginal costs: 


average combined cost = average variable cost + average fixed cost¹¹ 


and finally 

marginal combined cost = marginal variable cost + marginal fixed cost.¹² 


Moreover, since (except for the zeroth unit) marginal fixed cost is 
always zero, we have 


marginal combined cost = marginal variable cost, 


that is (except at a zero output level), fixed costs never affect marginal 
costs in any way, a result which will be applied at a number of points 
later in this volume. 


10 For if the zeroth unit has a positive marginal (total) cost, M₀, then the total cost 
of the first unit produced will be M₀ + M₁ where M₁ is the marginal cost of that unit. 
As a result, the average cost of that one unit will be (M₀ + M₁)/1 = M₀ + M₁ which is 
not equal to M₁, the marginal cost of the first unit. 

11 Proof: Writing TCC for total combined cost, etc., we have from our definition 
TCC = TVC + TFC. Dividing both sides of this equation by Q, the number of units, 
we obtain TCC/Q = TVC/Q + TFC/Q. But, again, by definition, TCC/Q = ACC, 
etc., so that we have our result ACC = AVC + AFC. 

12 Proof: Let TCC(20) represent the total combined cost when 20 units are produced, 
etc. Then, from our definition, we have 
TCC(20) = TVC(20) + TFC(20) and TCC(19) = TVC(19) + TFC(19). 
Subtracting the latter equation from the former, we obtain 
TCC(20) - TCC(19) = TVC(20) - TVC(19) + TFC(20) - TFC(19). 
But since, by definition, 
MCC(20) = TCC(20) - TCC(19), etc., 
we have our result, MCC(20) = MVC(20) + MFC(20). More generally, the same 
argument also obviously holds for any Qth unit, as well as for the 20th. Alternatively, 
this follows directly from the rules of the differential calculus, for if TCC = TVC + TFC, 
we have 

dTCC/dQ = dTVC/dQ + dTFC/dQ. 
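
The independence of marginal cost from fixed cost is easily illustrated. In the sketch below the variable cost schedule is a hypothetical assumption; only the $2,000 fixed cost comes from Table 2.

```python
FIXED = 2000  # total fixed cost, as in Table 2

# Hypothetical total variable cost for 0..5 units (illustrative figures).
variable_total = [0, 300, 550, 780, 1040, 1350]

# total combined cost = total variable cost + total fixed cost
combined_total = [FIXED + v for v in variable_total]


def marginals(totals):
    """Marginal cost of units 1, 2, ... as successive differences."""
    return [totals[q] - totals[q - 1] for q in range(1, len(totals))]


mvc = marginals(variable_total)   # marginal variable cost
mcc = marginals(combined_total)   # marginal combined cost

q = 4
acc = combined_total[q] / q                  # average combined cost
avc_plus_afc = variable_total[q] / q + FIXED / q

print(mvc, mcc)
print(acc, avc_plus_afc)
```

The fixed cost cancels out of every difference, so the two marginal lists are identical, while the averages differ by exactly the average fixed cost.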


7. Average vs. Marginal Figures in Business Practice 


In business operations one often encounters rule-of-thumb calculations 
which serve as substitutes for the operations researcher’s optimality com- 
putations. When these business calculations are explicit, they are frequently 
made in terms of average rather than marginal quantities, that is, in terms 
of the elements of the third rather than those of the fourth column of 
Table 1. We have already seen in Section 1 that decisions based on such 
average data are not likely to be anywhere near optimal, yet it is tempting 
to reason on the basis of unit (average) costs or revenues or profits, largely 
because of the difficulty of marginal data collection. It is almost always 
harder to obtain marginal figures than to acquire average data, for several 
reasons: 


1. Almost all accounting information is in the form of average or total 
rather than marginal figures. Tax computations and a number of other 
uses of accounting data require that this be so and this usage is well in- 
grained by tradition. 

2. By its very nature, marginal information often represents the an- 
swers to hypothetical questions—information beyond the range of the 
firm’s actual experience. One must ask, for example, what will be the 
effect on the firm’s profits of an increase in expenditure of type A (the 
marginal profitability of A), whether or not the firm has ever tried it. But 
note that this hypothetical question is precisely what must be answered in 
rational decision-making to determine whether or not to increase expendi- 
ture A. That is exactly why these difficult-to-obtain marginal figures are 
essential for good decision-making. 

3. Even where some relevant data are available from the past history 
of a company, it is much easier to collect the statistics required for average 
than for marginal figures. A single observation, that the total cost of 
producing 500 units of some output is $15,000, yields the information that 
its average cost is $15,000/500 = $30. But it takes at least one more ob- 
servation, say, that the total cost of 510 units is $15,050, to yield the guess 
that the marginal cost is $5 (since it costs an additional $50 to increase 
output by 10 units). And, in practice, many more than two observations 
will usually be required for any sort of reliable guess on marginal magnitudes. 
As we have seen, marginal x (e.g., cost) is the slope of a total x 
(cost) curve, and it is surely rash to guess at the slope of any such curve 
on the basis of only two statistical observations. 
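
The arithmetic of the third point can be written out as follows. The observations are those given in the text; the function names are illustrative only, and the marginal figure is, as the text stresses, no more than a guess from two points.

```python
def average_cost(total_cost, units):
    """Average cost needs only a single observation."""
    return total_cost / units


def marginal_cost_estimate(obs_a, obs_b):
    """Rough marginal cost from two (units, total cost) observations."""
    (q_a, c_a), (q_b, c_b) = obs_a, obs_b
    return (c_b - c_a) / (q_b - q_a)


first = (500, 15_000)    # 500 units cost $15,000 in total
second = (510, 15_050)   # 510 units cost $15,050 in total

avg = average_cost(first[1], first[0])
mc = marginal_cost_estimate(first, second)
print(avg, mc)
```

The two figures differ widely ($30 against $5), which is why the choice between them matters for decisions.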
Another reason for the popularity of the concept of the average cost or 
unit profit is that it is so simple and straightforward. Anyone knows that 
if profits per unit are high, the firm must be making money. But, as we 
have already seen, this sort of “practical” reasoning can be seriously 
fallacious. To illustrate the point once more, the decision to produce more 
of a commodity which brings in a healthy profit per unit (average profit) 
may be very costly to the firm, as Table 1 readily shows. There the unit 
profit on the product varies between $50 and $90 and this may appear to 
be a very healthy rate of return if the average cost of the item is, say, in 
the vicinity of $30. Nevertheless, the decision to increase the output of 
this good from 4 to 5 (million) units is tantamount to incurring a loss of 
$30 (marginal profit = -30)! As is shown there, the increased output can 
actually reduce the firm’s total take by this amount, despite the high level 
of average profit. 

The use of average data in any optimization problem can lead to such 
unsatisfactory results. The logic of the difficulty is not hard to explain. 
For example, in any allocation problem—say the reassignment of ad- 
vertising funds—the question is not whether money already spent in pub- 
licizing product A has brought high returns. What must be determined is 
whether the spending of additional money can be justified. It may well be 
that the public is already saturated with singing commercials, contests, 
and free samples of product A, and although the money already spent on 
the item brought in very satisfactory returns, more such company ex- 
penditures on this product might even repel the public. Rather, the money 
may be better spent on the promotion of some product B, on which previous 
outlays were so niggardly as to be almost completely ineffective, but where 
the payoff to additional expenditures may be large because they permit 
some sort of public perception threshold to be reached. 


8. Averages as Approximations to Marginal Figures 


It is clear, then, that wherever there is a difference between average 
and marginal data, it is the latter which must be given prior consideration 
in an optimization problem.¹³ Unfortunately, as has been shown, marginal 
data may be difficult and in some cases, for practical purposes, impossible 
to come by. It is therefore sometimes necessary to make do with average 
figures. For this purpose one must understand the relationship between 
average and marginal figures to recognize the circumstances under which 
the one can be expected to provide a reasonably good approximation to 
the other, and to determine, when this is not the case, what sort of rough 
adjustments in the average data can be made to bring them closer to the 
unknown marginal figures. 


13 However, average figures must also always be consulted: marginal data may 
show the best that can be done, but if even this “best” arrangement involves an average 
loss of $2 per unit, it may be optimal to drop that segment of the business altogether. 

For this purpose we can again employ Rule 5 of Section 3 [and Equa- 
tion (1) of footnote 3]. This yields the following correction procedures: 


1. Suppose the average figure rises as the number of units increases 
(e.g., as production, advertising expenditure, warehouse size, or the 
number of trucks used by the company goes up). Then, since marginal x 
must be above average x, any average figure must be revised upward to 
obtain an estimate of the corresponding marginal figure. Moreover, the 
more rapid the rise in average x, the greater will be the correction required. 

2. Similarly, when average x is falling, the marginal figure will lie 
below the average figure, so that any average should be reduced to obtain 
the corresponding marginal datum. 

3. Only when average x is neither increasing nor decreasing will the 
two figures coincide so that the average figure will need no readjustment. 
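
These three correction procedures can be sketched with the relationship M = A + Q(dA/dQ) of footnote 3, approximating dA/dQ by the change between two observed averages. The figures and the function name below are hypothetical, and the result is only a rough estimate of the kind the text has in mind.

```python
def marginal_from_average(avg_now, units_now, avg_before, units_before):
    """Approximate M = A + Q * (dA/dQ), using a finite difference for dA/dQ."""
    slope = (avg_now - avg_before) / (units_now - units_before)
    return avg_now + units_now * slope


# Hypothetical observations: the average was 30 at 100 units.
rising = marginal_from_average(32.0, 110, 30.0, 100)    # average rising
falling = marginal_from_average(28.0, 110, 30.0, 100)   # average falling
flat = marginal_from_average(30.0, 110, 30.0, 100)      # average unchanged

print(rising, falling, flat)
```

A rising average is revised upward, a falling average downward, and a constant average is left as it stands.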


In more concrete economic terms, this means that the following types 
of adjustment must be made to estimate marginal from average data: 


(a) In cost figures, if there is reason to believe there are economies of 
large-scale production (falling average costs), marginal cost will be less 
than average cost, so that the average cost figure must be adjusted down- 
ward to yield a better approximation to the marginal cost figure. On the 
other hand, if observation suggests the presence of important diminishing 
returns, the average cost figure must be adjusted upward. 

(b) In the case of productivity figures, the reverse holds. Economies of 
large scale (increasing average product per dollar of outlay) require an 
upward revision of average product figures to yield a “guestimate” of 
marginal product, and diminishing returns require that the average figure 
be reduced. 

(c) In the case of revenue data, since demand (average revenue) curves 
are normally downward sloping, the marginal revenue will be lower than 
the average revenue. The more inelastic the demand, the greater the 
difference will be.¹⁴ 


A little experience in looking about and asking the proper questions 
should soon permit the analyst to recognize the presence of such phenom- 
ena as economies of large scale and diminishing returns, and on this basis 
to use the preceding rules to obtain rough marginal data. 


14 By the methods shown in the footnotes of Chapter 9, Section 4, it can be shown 
that the relationship is mr = ar(1 - 1/E), where E is the price elasticity of demand. 
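
The relationship between marginal and average revenue can be tabulated for a few elasticities. This sketch assumes that E is measured as a positive number (the absolute value of the price elasticity); the price of $10 is a hypothetical figure.

```python
def marginal_revenue(average_revenue, elasticity):
    """mr = ar * (1 - 1/E), with E taken as a positive number (assumption)."""
    if elasticity <= 0:
        raise ValueError("elasticity must be positive")
    return average_revenue * (1.0 - 1.0 / elasticity)


price = 10.0  # hypothetical average revenue (price) in dollars
results = {e: marginal_revenue(price, e) for e in (4.0, 2.0, 1.0, 0.5)}
for e, mr in results.items():
    print(f"E = {e:>4}: marginal revenue = {mr:6.2f}")
```

The gap between price and marginal revenue widens as demand becomes less elastic, and marginal revenue turns negative once E falls below one.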
These results show that there is something intermediate between the 
counsel of perfection which regards anything but marginal data as 
absolutely useless for decision-making and the uncritical calculations which 
employ average or total figures largely as they are received from the ac- 
countants’ records. Because economic information is, in any event, no- 
toriously inaccurate, it may be that the crude and approximative marginal 
figures which are obtained with the aid of the preceding rules will often be 
as satisfactory as anything that can reasonably be hoped for. Even if this 
is not quite true, experience seems to suggest that such rough adjustments 
will in many cases eliminate the bulk of the error which arises from the 
use of average data as a basis for decision-making. It is even plausible that 
efforts to obtain better marginal data will sometimes cost more than they 
can add to the profits of the firm. 


9. First- and Second-Order Optimality Conditions 


The marginal rules for optimal decision-making which have been discussed 
so far in this chapter all suffer from a serious omission. Even in 
the context within which they have been described they represent necessary 
but not sufficient conditions for an optimum. That is, any arrangement 
which is optimal must satisfy these rules, but there are likely to be some 
alternative nonoptimal situations which also satisfy our marginal, or 
first-order, conditions. 

This point is discussed in detail in Section 5 of Chapter 4. However, it 
is of such importance that it must be called to the reader’s attention here 
as well. 

The logic of the first-order requirement that an activity should, if 
possible, be carried to a point where its marginal yield is zero has already 
been discussed in Section 4. As will be recalled, it was pointed out that the 
marginal analysis seeks the highest point on the graph representing profit 
(or whatever other variable one desires to maximize) by hunting for a 
level point such as M in Figure 1. Only at such a level point (i.e., only at a 
point where the marginal yield of a small decision change is zero) can we 
be at the top of a smoothly curved hill of the sort depicted in the diagram. 
At any point such as E or B where the marginal yield is nonzero, we must 
be either on the upward- or downward-sloping side of the hill. Hence, unless 
the marginal yield is zero we cannot be at a maximum, and the maximiza- 
tion method of the marginal analysis, then, consists in the hunt for points 
(output levels) where marginal yield is zero. 

Once having found any such points we can be certain that all points 
which have been ruled out (i.e., all points corresponding to nonzero 
marginal yields) are not the maximum which we seek. However, it is perfectly 
possible that among the points which remain to us (those whose marginal 
yields are zero) there will be some which are not maximal. This possibility 
is illustrated in Figure 4 of the next chapter. There points A and B as well 
as all points on the segment CD are of zero slope—they correspond to zero 
marginal yields. Yet only A is a true maximum. Points on CD lie on a 
plateau whereas point B is a minimum! 

Therefore, to be sure whether we have found a maximum or a minimum 
when we have located a point whose marginal yield is zero, we must examine 
it in terms of what are called the second-order conditions.¹⁵ These 
require for a minimum that in the vicinity of the point in question the curve 
be U-shaped so that our level point does, indeed, lie at the bottom of a 
valley. On the other hand, for a maximum the second-order conditions re- 
quire that in the neighborhood of our point the curve take the shape of 
an upside-down U. A systematic and precise formulation of these second- 
order conditions is provided in the next chapter. 

We conclude, then, that only if both the first-order (zero marginal yield) 
and second-order (curvature) conditions are satisfied can we be certain 
that we have found the maximal point. The same two types of condition 
must, of course, be examined in the search for a minimum. 
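
The two sets of conditions can be illustrated with a small example. The profit function below is a hypothetical cubic chosen so that it has one level point of each kind; the second-order test is the sign of the second derivative, as noted in footnote 15, and it identifies a maximum or minimum only in the neighborhood of the point.

```python
def profit(x):
    """Hypothetical profit function (an assumption for illustration)."""
    return x ** 3 - 6 * x ** 2 + 9 * x + 1


def marginal(x):
    """First derivative: the marginal yield."""
    return 3 * x ** 2 - 12 * x + 9


def curvature(x):
    """Second derivative: used for the second-order condition."""
    return 6 * x - 12


def classify(x):
    """Apply the first-order and then the second-order condition at x."""
    if marginal(x) != 0:
        return "not a level point"
    c = curvature(x)
    if c < 0:
        return "maximum"
    if c > 0:
        return "minimum"
    return "undetermined"


for x in (1, 2, 3):
    print(x, profit(x), classify(x))
```

Both x = 1 and x = 3 satisfy the first-order condition, but only the former passes the second-order test for a maximum.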


10. Global, Local, and Corner Maxima 


If the second-order conditions are satisfied throughout a graph, no 
further problems of principle will usually arise. But in some cases the 
second-order conditions will hold only for limited ranges of the figure or 
they may not even be satisfied at any point in a graph. In that case we 
may run into several added troubles. 

The first of these is the possibility that we may have a multiplicity of 
maxima—a series of little hills at various locations on the landscape 
(Figure 5). Here we have four profit hilltops, A, B, C, and D. 

True, one of them, C, towers above the others and is the highest point 
which we undoubtedly wish to find. Such a highest point is called the 
global maximum while the other hilltops are referred to as local maxima. 
But, unfortunately, the first- and second-order conditions are satisfied as 
perfectly at points A, B, and D as they are at the global maximum C. 
There is, then, no way in which the standard techniques of marginal 
analysis can by themselves distinguish between global and local maxima. 
Other computational difficulties which arise in the presence of the two types 
of maxima will be discussed in the chapter on nonlinear programming. 


15 The reader who has studied some differential calculus will recognize the 
second-order condition as the usual requirement that the second derivative be negative in the 
case of a maximum and positive for a minimum. See Section 5 of the next chapter. 
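
The difficulty with several hilltops can be made concrete. In the sketch below the profit figures are hypothetical values at successive output levels, arranged to give four hilltops like A, B, C, and D of Figure 5. A local test finds all four; only a direct comparison of their heights singles out the global maximum.

```python
# Hypothetical profit at successive output levels (illustrative figures).
profits = [0, 3, 5, 4, 2, 6, 8, 7, 3, 9, 12, 10, 6, 7, 8, 5]


def local_maxima(values):
    """Indices of interior points higher than both of their neighbours."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    ]


peaks = local_maxima(profits)

# The local test cannot rank the hilltops; that needs a comparison of heights.
global_peak = max(peaks, key=lambda i: profits[i])

print(peaks, global_peak, profits[global_peak])
```

Each of the four peaks looks the same to a test that examines only its immediate neighborhood.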
There is also a second possible cause for concern which is particularly 
likely to arise when the second-order conditions are violated. This is the 
corner maximum, although that sort of maximum may occur even if the 
second-order conditions hold. Figure 6 depicts corner maxima of both 
types. Suppose we are dealing with a plant whose output capacity is limited 
to Qcap. Suppose, moreover, that its total profit function is DD’ whose 
shape clearly violates the second-order conditions. Note that the maximal 
point on such a curve is not likely to occur somewhere in the vicinity of its 
center. Rather, its highest point will typically occur at one of its ends, 
point L (zero output) or point M (capacity output). Such a maximum 
which occurs where a curve cuts one of the axes or at some other end point 
of its relevant range is called a corner maximum (in contrast with the other 
type of maximum, the interior maximum, such as M in Figure 1). 

The figure also depicts a corner maximum which occurs with a profit 
function which satisfies the second-order conditions (curve SS’). Here the 
highest attainable point is not the level point G, but the corner maximum 
point N which corresponds to the capacity output. 

[Figure 5: total profit plotted against output, with hilltops A, B, C, and D] 

[Figure 6: total profit plotted against output up to capacity output Qcap, 
with curves DD’ and SS’] 

For our present purposes corner maxima are important because they 
cause the usual rules of the marginal analysis to break down. Notice that 
at none of the maximum points in Figure 6—L, M, or N—is the pertinent 
profit curve level. That is, none of those points fulfills the requirement 
that marginal yield be equal to zero. Since the marginal analysis cannot 
cope with corner maxima, it was necessary to invent a special analytic 
procedure, mathematical programming, to deal with these and other 
closely related phenomena. 
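
The breakdown of the marginal rule at a corner can be seen in a small numerical search. The three profit functions and the capacity of 12 units below are hypothetical assumptions: the first satisfies the second-order conditions but has its level point beyond capacity (like SS’), while the other two violate them (like DD’). The search is a plain comparison over a grid of outputs, not a mathematical programming method.

```python
CAPACITY = 12.0  # hypothetical capacity output


def concave_profit(q):
    """Level point at q = 20, beyond capacity (assumed example)."""
    return 40.0 * q - q ** 2


def convex_profit_high_end(q):
    """Level point at q = 5 is a minimum; profit is highest at capacity."""
    return q ** 2 - 10.0 * q + 30.0


def convex_profit_low_end(q):
    """Level point at q = 7 is a minimum; profit is highest at zero output."""
    return q ** 2 - 14.0 * q + 30.0


def best_output(profit, capacity, steps=1200):
    """Compare profit over an evenly spaced grid of outputs in [0, capacity]."""
    grid = [capacity * i / steps for i in range(steps + 1)]
    return max(grid, key=profit)


best_ss = best_output(concave_profit, CAPACITY)
best_dd_high = best_output(convex_profit_high_end, CAPACITY)
best_dd_low = best_output(convex_profit_low_end, CAPACITY)

# Marginal yield of the first function at its corner maximum: 40 - 2q.
marginal_at_corner = 40.0 - 2.0 * best_ss

print(best_ss, best_dd_high, best_dd_low, marginal_at_corner)
```

In each case the best attainable output lies at an end of the range, and the marginal yield there is not zero.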

It must be observed before we leave this subject that the occurrence 
of corner maxima is rather common in economic problems. For, by defini- 
tion, a point lies above one of the axes in a diagram whenever the value of 
the corresponding variable is zero. Thus, if a firm finds that it can maximize 
its profits by eliminating some product from its line (reducing that output 
to zero), it must have attained a corner maximum. The same must also be 
true of a consumer who decides to purchase only some of the many 
commodities which are offered for sale to him (for any one of the remaining 
items i his quantity purchased will be Q_i = 0). This must be so because a 
corner maximum is one where the value of at least one variable has, as it 
were, been pushed as far as it can go and zero is, of course, as far as an 
output can be reduced. We may notice also why the marginal condition 
breaks down in this case. Once an output has been reduced to zero, even 
if its marginal yield is negative so that a further cut would be desirable, 
there is nothing anyone can do about it. We have reached our optimal 
position even though marginal yield is not zero. 

In many areas of economic analysis it has proved convenient to proceed 
on the assumption that the relevant maxima were of the interior rather 
than the corner variety. However, the importance of maxima of the latter 
sort must not be forgotten. Their significance and the methods by which 
they are analyzed will become clearer in the chapters on mathematical 
programming and activity analysis. 


11. The Second-Order Conditions and Stability 


One characteristic of the second-order conditions remains to be cleared 
up. In some earlier writings, solutions which satisfy both the first- and 
second-order conditions were referred to as stable equilibria, while solutions 
which satisfy the first-order but not the second-order conditions were 
called unstable equilibria. It is easy to see why this terminology was 
employed. If someone departs from a solution point of the first variety, e.g., 
from quantity Q_A in Figure 4 of Chapter 4, he is motivated to try to return 
to the equilibrium point Q_A. On the other hand, if something pushes 
someone away from Q_B where the second-order condition is violated, he will 
have no desire whatsoever to return to that quantity. 

However, more recently it has been pointed out that this terminology 
is misleading. Q_B is nowadays not called an unstable equilibrium quantity. 
Rather, it is now clear that Q_B is no equilibrium at all. Not only is there 
no motivation to return to Q_B after leaving it; there is really no reason for 
anyone to wish to go there in the first place. An “equilibrium” may be 
defined as a set of variable values which, once attained, do not automatically 
set change-producing forces into motion. But even though the first-order 
maximum conditions are satisfied at Q_B no one will want to stay at that 
quantity: it is a minimum and the value of output will automatically be 
changed from Q_B by any rational decision-maker. 
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On the other hand, Q4 is an equilibrium point. Once the maximizing 
decision-maker gets there he will do nothing to move away. But this is no 
reason to consider it stable. Stability of an equilibrium means that if for 
some reason there is a displacement from the equilibrium point the system will 
eventually return at least reasonably close to that point. Why, then, may we 
doubt that Q4 will be stable? The answer is that, though the decision- 
maker may want to make things move toward Q4, there is many a slip . . . . 
Suppose, for example, that Q4 is an inventory level and that the company 
reduces its production if inventory happens to be above Q4 and increases 
its output if inventory is below that level. Then it is possible (and it does 
happen in practice) that companies’ production-control rules lead to over- 
compensation. If inventory starts out above Q4, production may be cur- 
tailed so sharply that inventory level falls well below its target. This, in 
turn, may cause an extreme increase in production and a rebuilding of 
surplus inventory, and so on, ad infinitum. Here we see, then, that though 
the objective of management is perfectly clear—it seeks to attain the 
equilibrium (profit-maximizing) inventory level Q4—the dynamics of its 
adjustment process prevent it from attaining that level. The equilibrium 
just is not stable. But we are able to test for stability only by examining 
explicitly the dynamics of the adjustment process, and the second-order 
conditions, at least by themselves, do not enable us to settle the issue. 
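
The overcompensation story can be made concrete with a small simulation. The sketch below is only an illustration under an assumed adjustment rule that is not taken from the text: each period the firm corrects its inventory by k times the gap between current inventory and the equilibrium level. The function name, the target of 100, and the values of k are all hypothetical.

```python
def simulate_inventory(start, target, k, periods):
    """Return the inventory path when each period's correction is k times the gap.

    This linear adjustment rule is an assumption made for illustration only.
    """
    path = [start]
    for _ in range(periods):
        gap = path[-1] - target
        path.append(path[-1] - k * gap)
    return path


target = 100.0  # hypothetical profit-maximizing inventory level
damped = simulate_inventory(120.0, target, k=0.5, periods=20)
explosive = simulate_inventory(120.0, target, k=2.5, periods=10)

print("mild correction ends at", round(damped[-1], 4))
print("overcompensation path:", [round(v, 1) for v in explosive[:5]])
```

Under this assumed rule the gap is multiplied by (1 - k) every period, so inventory returns to its target only when the absolute value of 1 - k is below one. The target is the same in both runs; only the dynamics differ, which is why the second-order conditions alone cannot settle the question of stability.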

1. Differential Calculus and Marginal Analysis


Before we get down to the differential calculus and its relationship to
marginal analysis, it is necessary to translate the marginal concept into
algebraic notation. Suppose, for example, that an additional $5 in
advertising expenditure were to increase a firm's sales by $100. We then
evaluate the marginal sales contribution of advertising as 100/5 = 20. In
algebraic notation, if S represents sales volume and A symbolizes the
amount of advertising, it is customary to represent an addition to (change
in) S and A by ΔS and ΔA respectively,¹ so that the marginal sales
contribution of advertising may be written as ΔS/ΔA.

It will be noted that this expression does not specify whether the change
in advertising expenditure is large or small; in the numerical example we
took ΔA = $5. But this ambiguity leads to a further difficulty because it
means that the value of the marginal yield figure itself may not be
determined. For example, suppose we encounter a rather extreme
diminishing-returns case in which the first thousand dollars adds $40,000 to the sales
of a small retailer, but a second thousand in advertising adds very little
more, and a third thousand poured into a saturated market repels customers


1 Δ is the (upper-case) Greek letter delta.

and actually reduces sales by $10,000. If we take ΔA = $1,000, then the
marginal yield is

$$\frac{\Delta S}{\Delta A} = \frac{40{,}000}{1{,}000} = 40.$$

But if ΔA = $2,000, we have

$$\frac{\Delta S}{\Delta A} = \frac{40{,}000 + 0}{2{,}000} = 20,$$

and finally, if ΔA = $3,000, we obtain

$$\frac{\Delta S}{\Delta A} = \frac{40{,}000 + 0 - 10{,}000}{3{,}000} = 10.$$


In other words, the value of the marginal yield of advertising varies with
the magnitude of ΔA, the magnitude of the increment in advertising
expenditure whose effect we decide to consider, and it is easy to see that a
similar problem arises even in less extreme examples.
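
The dependence of the marginal figure on the size of the increment can be checked with a few lines of arithmetic. The sketch below uses the retailer figures from the illustration (the second thousand dollars is treated as adding nothing, as in the computation above); the function and variable names are introduced here only for the example.

```python
# Sales effect of each successive $1,000 of advertising in the retailer illustration.
increments = [40_000, 0, -10_000]  # first, second, third thousand dollars


def marginal_yield(thousands):
    """Sales change per advertising dollar over the first `thousands` blocks of $1,000."""
    delta_s = sum(increments[:thousands])
    delta_a = 1_000 * thousands
    return delta_s / delta_a


for k in (1, 2, 3):
    print(f"increment of ${k * 1_000:,}: marginal yield = {marginal_yield(k)}")
```

The same underlying data give three different "marginal" figures, which is the ambiguity the limit process is designed to remove.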

The heart of the difficulty is that if ΔA is a large number, the marginal
measure becomes a rather crude over-all representation of the effects of a
change in expenditure, and it therefore becomes a relatively blunt
decision-making tool. For example, if a marginal computation is made using ΔA =
$3,000 in the preceding illustration, we have already noted that we obtain

$$\frac{\Delta S}{\Delta A} = \frac{40{,}000 + 0 - 10{,}000}{3{,}000} = 10.$$

In other words, this calculation indicates that a dollar in advertising adds
$10 to this retailer's sales and may suggest to him that it pays hand over
fist to spend even more on promotion. But a more detailed calculation
shows that he should actually have stopped after his first $1,000
promotional outlay.

We see, then, that the larger the magnitude of the change
in a decision variable whose effect we calculate, i.e., the larger the value of
the denominator in the marginal value fraction, the cruder the
marginal measure. It becomes a rough average measure of the effects of
advertising or whatever we happen to be examining and, as in the
preceding example, it may conceal more than it reveals.

This naturally suggests that we stick to small increments.
But, at least in principle, so long as we employ any finite
increment in the decision variable [...], we are left
with the problem that a calculation employing an even finer increment
might give us even more detailed [...] information.

The basic approach of the differential calculus is designed to cope with
precisely this problem. It proceeds by operating, conceptually, with smaller
and smaller units and taking as the value of the marginal fraction the limit of
the value of these fractions as the magnitude of the denominator decreases
indefinitely. For example, suppose we obtain the data in the following table:


ΔA         8        2        1        0.5       0.25
ΔS        30.4      6.4      3.1      1.525     0.75625
ΔS/ΔA      3.8      3.2      3.1      3.05      3.025


As we consider smaller and smaller changes in advertising expenditure,
their contribution to sales also grows correspondingly insignificant,² e.g.,
as advertising expenditure drops from 8 to 2, its contribution to sales falls
from 30.4 to 6.4. But the ratio of these magnitudes, ΔS/ΔA, may
nevertheless vary relatively little because both numerator and denominator are
going in the same direction. In our rather obviously cooked-up example
the ratio ΔS/ΔA shows a remarkably simple pattern, and it is easy to see
where it is heading. Keep up the process of cutting down on ΔA long
enough and the difference between the corresponding value of ΔS/ΔA and
the number 3 will not be worth mentioning: it can be made to approximate
3 to whatever degree of accuracy we desire. In standard mathematical
terminology we say that as ΔA approaches zero the fraction ΔS/ΔA
approaches the limit 3. This limit number 3 is represented by the symbol
dS/dA or, sometimes, by S′ and is called the first derivative of sales with
respect to advertising expenditure.

More generally, then, the first derivative dy/dx of any variable, y, with
respect to another variable, x, may be described as the limit value of the
marginal change in y per unit change in x when the change in x is made to
be smaller and smaller, i.e., when the change approaches zero. This
combination of concepts (the notion of a limit number which is approached
by an infinite sequence of numbers, and the definition of the derivative)
is the foundation of differential calculus.
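
The approach to the limit can be traced numerically. The sketch below assumes a relationship, ΔS = 3ΔA + 0.1(ΔA)², chosen only because it reproduces the figures quoted in the text (a sales contribution of 30.4 for an increment of 8 and of 6.4 for an increment of 2); the text itself does not state the underlying formula.

```python
def delta_s(delta_a):
    """Assumed sales increment for an advertising increment (illustrative formula)."""
    return 3 * delta_a + 0.1 * delta_a ** 2


def ratio(delta_a):
    """The marginal fraction: sales increment per unit of advertising increment."""
    return delta_s(delta_a) / delta_a


for da in (8, 2, 1, 0.5, 0.25, 0.01, 0.0001):
    print(da, round(delta_s(da), 6), round(ratio(da), 6))
```

Both the numerator and the denominator shrink toward zero, yet their ratio settles down ever closer to 3, which is the value the text calls the limit.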


2 Of course it must be recognized that this illustration is economic nonsense. Increases
in most types of advertising cannot be bought in units as small as $8; one cannot
even talk to an advertising man for that price. It should also be noted that a minimum
[...]
sales, but the table does not assume that we start off with zero advertising. We may be
asking whether the firm should increase or decrease its current million-dollar budget so
that even if ΔA is only $6 the total A is increased to $1,000,006.

Let us see now how these definitions can be employed. Suppose that
we have, by statistical or other means, obtained an algebraic relationship
between sales and advertising expenditure. To illustrate the logic of the
procedure, we consider three (rather implausible) cases:


(a) S = 50,000 (S a constant unaffected by A).
(b) S = 3A (relationship of proportionality).
(c) S = 3A² (second-degree relationship).


The first two cases are trivial and we can easily guess at the derivatives
that will be involved. In the first case sales are fixed and independent of
the level of advertising so that we may conclude, correctly, that dS/dA = 0,
i.e., that the marginal sales yield of advertising is zero. This is a special
case of the general


RULE 1. Constants: The derivative of any constant with respect to any
variable is always zero.


The second (linear) case is only slightly more complex. It is obvious
that the expression in question tells us that every unit increase in A
increases sales by $3, for sales then increase from 3A to 3(A + 1) = 3A + 3.
Therefore, dS/dA = 3. We thus have


RULE 2. First-degree terms: In any first-degree relationship y = bx,
where b is any number, we have

$$\frac{dy}{dx} = b.$$

It is only in the third-case relationship S = 3A² that the calculation
becomes slightly more complicated and that it seems worth going through a
systematic derivation of the value of dy/dx. The following rather simple
algebraic argument is typical of the procedures used in arriving at the
various differentiation formulae which are listed below. If the reader
understands it thoroughly, he will have some comprehension of the elements
of the logical structure of the differential calculus.

Our object is to evaluate dS/dA in the case S = 3A². For this purpose
we begin by determining the marginal fraction ΔS/ΔA, then finding what
happens to it as ΔA approaches zero, just as described in the previous
section. We start off by adding any increment ΔA to the A in our equation
S = 3A² to see what effect this has on S. This changes the value of S, so
that, by the definition of ΔS, the procedure replaces S by S + ΔS. We
therefore obtain

Step 1:

$$\begin{aligned}
S + \Delta S &= 3(A + \Delta A)^2 = 3[A^2 + 2A\,\Delta A + (\Delta A)^2] \quad \text{(by squaring and multiplying out)} \\
&= 3A^2 + 6A\,\Delta A + 3(\Delta A)^2.
\end{aligned}$$


But we are looking for ΔS, not for S + ΔS. We therefore subtract the
expression S = 3A² from the preceding S + ΔS equation. This gives us

Step 2:

$$\Delta S = (S + \Delta S) - S = 3A^2 + 6A\,\Delta A + 3(\Delta A)^2 - 3A^2 = 6A\,\Delta A + 3(\Delta A)^2.$$

That is,

$$\Delta S = 6A\,\Delta A + 3(\Delta A)^2.$$


We now find the marginal sales contribution of advertising, ΔS/ΔA, by
dividing both sides of the equation for ΔS by ΔA to obtain

Step 3:

$$\frac{\Delta S}{\Delta A} = \frac{6A\,\Delta A}{\Delta A} + \frac{3(\Delta A)^2}{\Delta A} = 6A + 3\,\Delta A.$$


But the derivative, dS/dA, is the limit of ΔS/ΔA as ΔA approaches zero
(written ΔA → 0), i.e., (in conventional notation) we calculate

Step 4:

$$\frac{dS}{dA} = \lim_{\Delta A \to 0} \frac{\Delta S}{\Delta A} = \lim_{\Delta A \to 0} (6A + 3\,\Delta A) = 6A;$$

that is, the term 3ΔA, which clearly approaches zero when ΔA approaches
zero, simply drops out in the limit. Hence we have our result for S = 3A²,
that in this case dS/dA = 6A.

More generally (the derivation follows the preceding one step by step)³
we have


3 There is one complication. Raising (x + Δx) to the bth power is ordinarily more
complicated than squaring it. If b is an integer (a whole number) we employ the binomial
theorem of high school algebra to multiply out (x + Δx)^b. The proof then proceeds
exactly as above.
Note that Rules 1 and 2 are both special cases of Rule 3 in which b = 0 and b = 1,
respectively.

RULE 3. Power terms: For any constants a and b, if y = ax^b, then

$$\frac{dy}{dx} = b\,a\,x^{b-1}.$$

Example: The derivative of 4x¹⁰ is 4·10x⁹ = 40x⁹.
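
As a rough numerical check on the four-step derivation and on Rule 3, the sketch below compares the marginal fraction for ever smaller increments with the value given by the power rule, using S = 3A² at the arbitrarily chosen point A = 5. The helper names are hypothetical and serve only this illustration.

```python
def difference_quotient(f, x, h):
    """Marginal fraction (f(x + h) - f(x)) / h for a finite increment h."""
    return (f(x + h) - f(x)) / h


def power_rule(a, b, x):
    """Rule 3: the derivative of a * x**b is b * a * x**(b - 1)."""
    return b * a * x ** (b - 1)


def sales(A):
    return 3 * A ** 2


A = 5.0
for h in (1.0, 0.1, 0.001):
    # Step 3 of the derivation says this equals 6A + 3h.
    print(h, difference_quotient(sales, A, h), 6 * A + 3 * h)

print("power rule value:", power_rule(3, 2, A))
```

The difference quotient and the expression 6A + 3ΔA agree up to rounding, and both approach the power-rule value 6A as the increment shrinks.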


Several other frequently used rules of the differential calculus are now
given and their use illustrated:


RULE 4. Sums: The derivative of a sum of several terms is the sum of the
derivatives of these terms, i.e., if y = y_1 + y_2, then

$$\frac{dy}{dx} = \frac{dy_1}{dx} + \frac{dy_2}{dx}.$$

Example: Given y = 500 − 2x + 5x⁴. We know by Rules 1-3 that the
derivative of 500 is zero, that of −2x is −2, and that of 5x⁴ is 4 × 5x³ = 20x³. Hence,
in this case,

$$\frac{dy}{dx} = 0 - 2 + 20x^3 = -2 + 20x^3.$$

RULE 5. Miscellaneous formulas:
(a) if y = a sin bx, dy/dx = ab cos bx
(b) if y = a cos bx, dy/dx = −ab sin bx
(c)⁴ if y = ae^(bx), dy/dx = bae^(bx).


Thus, in particular, ae^x is an expression (indeed the only expression)
which is equal to its own derivative, i.e.,


(d) if y = ae^x, dy/dx = ae^x (= y)
(e)⁵ if y = a log_e bx, dy/dx = a/x.


4 Here e (= 2.718 approximately) is the number that represents the principal which
would accrue after one year if one dollar were compounded at every instant over the year
at a 100 per cent (annual) rate of interest. This number occupies an extremely important
position in the differential calculus, particularly in the theory of growth.

5 Log_e x (the logarithm of x to the base e) is defined by the relationship x = e^(log_e x).
In other words, log_e x is a number L, such that when e is raised to the power L the result
is equal to x. This number has most of the convenient properties of the common
logarithm (i.e., the logarithm to the base 10; x = 10^(log_10 x)). For example, by the definition,
and since e^a · e^b = e^(a+b), for the product of any two numbers x and z,

$$xz = e^{\log_e x} \cdot e^{\log_e z} = e^{\log_e x + \log_e z}.$$


Thus two numbers can be multiplied by adding up their logarithms (to the base e) and 
then finding the number whose logarithm is equal to this sum. 

One reason logarithms to the base e are employed in much scientific work is that
they have such a simple differentiation rule.

RULE 6. Products: The derivative of the product of two expressions is
the (undifferentiated) second expression multiplied by the
derivative of the first plus the first multiplied by the derivative of the
second; that is, if y = y_1 y_2, then

$$\frac{dy}{dx} = y_2 \frac{dy_1}{dx} + y_1 \frac{dy_2}{dx}.$$

Example: Given y = 6x⁴ sin (x/3), find dy/dx. We know by Rule 3 that the
derivative of 6x⁴ is 24x³, and by Rule 5a that the derivative of sin (x/3) is (1/3) cos (x/3).
Hence by Rule 6,

$$\frac{dy}{dx} = 24x^3 \sin\left(\tfrac{x}{3}\right) + 6x^4\left[\tfrac{1}{3}\cos\left(\tfrac{x}{3}\right)\right]
= 24x^3 \sin\left(\tfrac{x}{3}\right) + 2x^4 \cos\left(\tfrac{x}{3}\right).$$


RULE 7. Division: If y is the ratio of two expressions, its derivative is
the denominator multiplied by the derivative of the numerator minus the
numerator multiplied by the derivative of the denominator, all divided by
the square of the denominator⁶; i.e., if y = y_1/y_2 then

$$\frac{dy}{dx} = \frac{y_2\,(dy_1/dx) - y_1\,(dy_2/dx)}{y_2^{\,2}}.$$

Example: If y = log_e x / e^(3x) then, since the derivative of log_e x is 1/x and that
of e^(3x) is 3e^(3x), we have

$$\frac{dy}{dx} = \frac{e^{3x}(1/x) - 3(\log_e x)\,e^{3x}}{(e^{3x})^2},$$

where, by definition, (e^(3x))² = e^(6x).


RULE 8. The chain rule: If y is a function of some variable z whose value
is, in turn, a function of another variable, x, then the derivative of y with
respect to x equals the derivative of y with respect to z multiplied by the
derivative of z with respect to x, i.e.,

$$\frac{dy}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx}.$$

To illustrate Rule 8, consider the four illustrative functions (1) y = e^z,
where z = 5x³; (2) y = sin z, where z = e^x; (3) y = cos z, where z = sin x;
and (4) y = 5z², where z = log_e x. Their derivatives are

(1′) e^z · 15x² = 15x² e^(5x³),

(2′) (cos z) e^x = e^x cos e^x,

(3′) −sin z cos x = −sin (sin x) cos x, and

(4′) 10z/x = 10 log_e x / x.


6 Rule 7 can be derived directly from Rules 6, 3, and 8.

The chain rule also permits us to take relatively complicated functions
and break them into two simpler components each of which can be
differentiated separately. For example, consider the relationship (1″)
y = e^(5x³).

By arbitrarily setting z = 5x³ in (1″) it is immediately transformed into
our first problem (1) to which the chain rule applies immediately. Similarly,
we transform (2″) y = sin (e^x), (3″) y = cos (sin x), and (4″) y = 5(log_e x)²
into (2), (3), and (4) by setting, respectively, z = e^x in (2″), z = sin x in
(3″), and z = log_e x in (4″).

Thus, in general terms, given the function of a function y = g[f(x)] we
may rewrite it as y = g(z), where z = f(x). Rule 8 enables us to divide up
and conquer such a problem. To find dy/dx we obtain the more easily found
derivatives dy/dz and dz/dx and then get our answer just by multiplying
them together.
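
A minimal numerical sketch of this divide-and-conquer procedure follows. It takes the function of a function y = 5(log_e x)², splits it into y = 5z² and z = log_e x, multiplies the two easy derivatives, and compares the product with a finite-difference estimate. The helper names, the evaluation point x = 2, and the step size are choices made only for this illustration.

```python
import math


def central_difference(f, x, h=1e-6):
    """Finite-difference estimate of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def chain_rule(dy_dz, dz_dx, z_of_x, x):
    """dy/dx = dy/dz (evaluated at z = z(x)) multiplied by dz/dx."""
    return dy_dz(z_of_x(x)) * dz_dx(x)


x = 2.0
# Outer function y = 5 z**2 has dy/dz = 10 z; inner function z = log(x) has dz/dx = 1/x.
analytic = chain_rule(lambda z: 10 * z, lambda t: 1 / t, math.log, x)
numeric = central_difference(lambda t: 5 * math.log(t) ** 2, x)

print(analytic, numeric)
```

The two numbers agree to several decimal places, as the chain rule says they should; the finite-difference value is only an approximation, so exact equality is not to be expected.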


Example: Given y = log_e (3x²), to find dy/dx we write

    y = log_e z          z = 3x²
    dy/dz = 1/z          dz/dx = 6x

and therefore, substituting into the equation of Rule 8, we have (since z = 3x²)

$$\frac{dy}{dx} = \frac{dy}{dz} \cdot \frac{dz}{dx} = \frac{1}{z} \cdot 6x = \frac{6x}{3x^2} = \frac{2}{x}.$$


PROBLEMS
Differentiate the following: 
1. y = liz? — 162? + 3 5. y = &/sinz 
2. y = —4z? + 17? + 2cos 4z 6. y = e*/logz 
8. y = 72-9 (= 7/z9) y= 3 sin 524 
4.y = e* sin z 8. y = 8e* 


3. Geometric Interpretation: The First-Order Maximum Condition 


It is useful at this point to translate the derivative into graphic terms.
Consider the relationship between the quantity Q of a commodity marketed
by some firm and the total profit R which accrues to it. Such a graph is
depicted as curve PP′ in Figure 1.

[Figure 1. Vertical axis: total profit.]

First, it will be recalled that the slope of a line is defined as the increase
in its height per unit move to the right. For example, as we move to the
right from point Q_0 to Q_1, the curve rises from R_0 to R_1. Hence its (average)
slope over this stretch is (R_1 − R_0)/(Q_1 − Q_0) or BC/AB. Similarly, we
note that as we move to the right from point Z to point D, the graph goes
downhill, so that the slope EM , 4D is negative (there is a negative increase
in the height of the curve).

Now the marginal profitability of an increase in output ΔQ is defined
as ΔR over ΔQ. But it will be observed that ΔQ, the increment in Q, is
Q_1 − Q_0 and, similarly, that ΔR is R_1 − R_0. Hence the slope of the curve
and the marginal profitability are the same number. More generally, let y
be some variable whose value depends on x. Then the slope of the curve of
this relationship represents the marginal effect on y of a change in x.

It will be noted that the slope of the curve changes as we move along
the diagram. That is why the value of ΔR/ΔQ is ambiguous and why we
turn to the concept of the derivative. By letting the interval which
represents ΔQ grow smaller and smaller, it more and more closely approximates a
single point. That is, if from ΔQ we shift our attention to a smaller interval
ΔQ′, and so on, we ultimately approach a state where nothing is left but
point Q_0. We then interpret the derivative of the function at point Q_0 as
the slope of the graph at that point.

However, this slope itself cannot be defined in the same way as the slope
of a finite interval, because at a point there is neither a change in Q nor any
change in R. That is, at A there is neither a ΔQ nor a ΔR, or if we wish,
they are both equal to zero.

This is where the limit process is invoked. The slope at A is defined as
the limit of the slope of AC as ΔQ approaches zero. Geometrically, this
limit turns out to be the slope of the tangent TT′ at point A. We see this
by noting that our original marginal figure ΔR/ΔQ is the slope of straight-line
chord AC. Our second approximation is the slope of a shorter chord
beginning at A, etc. It is clear intuitively that this sequence of slopes
approaches the slope of the tangent.

In sum, the derivative of any relationship between y and x with respect
to x at some value x* of x is represented by the slope of the tangent (at
point x = x*) to the curve which represents the relationship.

We can use this interpretation to discuss the problem of optimization.
Suppose management desires to maximize the firm's total profits. The
total profit graph in Figure 1 is shaped like a hill, and the output OQ_m
which maximizes profits is the point directly below the peak of the hill M.

This gives us a derivative criterion for locating point M. At points to 
the left of Qm the slope of the curve (the derivative) is positive. Such 
points cannot be optimal because a movement to the right from any one 
of them will increase profits. Similarly, any point to the right of Qm cannot 
be optimal because there we are going downhill (negative derivative) so 
that a retreat (a reduction in the size of output Q) will increase profits. 
Only when the derivative is zero, so that we are neither at the upgrade nor 
downgrade side of a profit hill, is it possible for us to be at an optimal point. 
Note that this condition is satisfied at point M, the top of the profit curve, 
which is level because it is the border line between the uphill and the 
downhill segments of the curve. 

This result corresponds to our first-order maximum condition of
marginal analysis rule 1 (Section 2 of Chapter 3) that any activity should, if
possible, be carried to a point where its marginal net benefit (its first
derivative) is zero.


Example: Find the output Q which maximizes profit when given the
relationship R = 300 + 1,200Q − Q².

Differentiating, we have

$$\frac{dR}{dQ} = 1{,}200 - 2Q.$$

Hence the only point at which this derivative is zero is where

$$1{,}200 - 2Q = 0,$$

i.e., where 2Q = 1,200 or Q = 600.
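
The example can be verified in a few lines. The sketch below takes the profit relationship to be R = 300 + 1,200Q − Q², the form implied by the derivative 1,200 − 2Q, solves the first-order condition, and confirms that neighbouring outputs yield less profit. The function names are introduced only for this illustration.

```python
def profit(q):
    """Total profit R = 300 + 1,200Q - Q**2."""
    return 300 + 1200 * q - q ** 2


def marginal_profit(q):
    """First derivative dR/dQ = 1,200 - 2Q."""
    return 1200 - 2 * q


# First-order condition: 1,200 - 2Q = 0, hence Q = 1,200 / 2.
q_star = 1200 / 2

print(q_star, marginal_profit(q_star), profit(q_star))
print(profit(q_star - 1), profit(q_star + 1))  # both lie below the profit at q_star
```

Checking the neighbours is only a sanity test; the systematic way of confirming that a zero-derivative point is a maximum is the second-order condition taken up in Section 5.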

4. Nondifferentiability and Limited Variable Range Problems


The results of the preceding section may be summed up as the following
first-order rule: To find the value of x which maximizes the value of y,
given some relationship y = f(x) between these variables, find the
derivative, dy/dx, set it equal to zero, and solve for x. This procedure is, indeed,
widely applicable but it has important limitations which must be
understood clearly. We shall now examine several cases in which the rule is
meaningless, inapplicable, or completely incorrect. Each of these cases is
of fundamental importance and not just a minor exception to be
acknowledged summarily and then forgotten.

Case 1. Discontinuities and Kinks 


In Figures 2a and 2b we represent relationships between y and x which
exhibit two important types of irregularity. At point B there is a sharp
corner or kink in the graph. Above point x_a there is an even more serious
type of irregularity: the graph has a complete break, or discontinuity, AA′.
The difficulty is that at such points the slope, that is, the derivative, is not
even defined. There is no tangent to the curves at the points directly above
x_a or x_b. It is to be noted that points A and B both represent maxima of
the functions, i.e., at x_a or at x_b, y becomes as large as possible. But at
neither point is dy/dx equal to zero because we cannot even impute the
usual meaning to the concept at such a point. Hence we conclude:

[Figure 2. Panel (a): y plotted against x, with maximum A above x_a. Panel (b): y plotted against x, with maximum B above x_b.]

In the presence of kinks or discontinuities the derivative is not defined,
so it may not be possible to employ the maximization criterion dy/dx = 0.


Case 2. Limitations on the Values of the Variables 


In Figures 3a and 3b are represented two cases in which the levels of
output Q are restricted. The unlikely situation in Figure 3a might be
referred to as the crop restriction subsidy case. Here the firm is in the peculiar
position that the less it produces and sells, the higher the profit it makes.
Clearly, however, the smallest output Q which the firm can produce is
zero, so that this is the point of maximum profits. Actually, analogous
situations are frequently encountered in the multiproduct firms which are
typical of our economy. If the production of one of the outputs of the firm
incurs a loss, the most profitable (least costly) output of that product will
be zero, though the total profit of the firm as a whole (R) may be a positive
number because its other products bring in enough money to keep the
company in the black.

[Figure 3. Panels (a) and (b).]

In this situation we note that at the point of maximum profits the slope 
of the graph is negative (dR/dQ < 0) so that a further decrease in output, 
if it were possible, would appear to be called for. But, since such a reduc- 
tion is not possible, we must be satisfied with a level of production at which 
the calculus maximization criterion fails: dR/dQ, i.e., the marginal profit 
yield of Q is not equal to zero. 

A similar difficulty occurs in somewhat more striking form in the
situation shown in Figure 3b. Here we assume that the firm has a limited output
capacity and that it can therefore produce no more than quantity OQ_C. In
this diagram there is a level maximum point M, at which the derivative
maximum condition dR/dQ = 0 is satisfied. However, this maximum is
economically irrelevant because the firm cannot attain it even though it
might well like to do so. The maximum profit feasible point is, in fact, C,
where dR/dQ > 0, so that the derivative maximization rule is violated.
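
For a single variable, both restricted cases can be handled by comparing any feasible zero-derivative points with the two ends of the permitted range. The sketch below is an illustration only: the profit functions, the capacity of 400, and the helper name are hypothetical, and this simple comparison is not the mathematical programming method discussed in later chapters.

```python
def best_feasible_output(profit, marginal_profit, stationary_points, lower, upper):
    """Pick the profit-maximizing output on [lower, upper].

    Candidates are the feasible points where marginal profit is zero plus the
    two ends of the permitted range, so corner maxima are not overlooked.
    """
    candidates = [q for q in stationary_points if lower <= q <= upper]
    candidates += [lower, upper]
    best = max(candidates, key=profit)
    return best, marginal_profit(best)


# Hypothetical capacity case: the unconstrained peak (Q = 600) is out of reach.
hill = lambda q: 300 + 1200 * q - q ** 2
hill_slope = lambda q: 1200 - 2 * q
print(best_feasible_output(hill, hill_slope, [600], lower=0, upper=400))

# Hypothetical loss-making product: profit falls with every unit produced.
loss = lambda q: 500 - 3 * q
loss_slope = lambda q: -3
print(best_feasible_output(loss, loss_slope, [], lower=0, upper=400))
```

In the first case the best feasible output sits at the capacity limit with a positive marginal profit; in the second it sits at zero output with a negative one. In neither case does the derivative vanish at the optimum.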


The two cases so far considered in which this rule breaks down can be
handled effectively only by means of a totally new approach for the
determination of optimal values of the variables. In the simple examples shown
in Figures 2 and 3 the optimal values of x and Q are, of course, obvious on
inspection. Particularly in Figure 3b the answer seems easy: to maximize
its profits the firm should produce as much as it can. But when the number


54 Maximization, Minimization, and Elementary Differential Calculus Chapter 4 


of variables involved is considerable, as it usually is in practice—for ex- 
ample, when the firm is dividing up its limited productive capacity among 
many hundreds of products—the answer is far from obvious. 

To deal with such optimization problems (frequently encountered in
economics) where the marginal maximization condition fails, a new body
of analysis, called mathematical programming, has been developed. This
analysis, which has turned out to be of very great significance for
economics and business decision-making, is discussed in considerable detail in
the next four chapters.


5. Second-Order Conditions of Maximization and Minimization 


A totally different but less serious sort of difficulty for the calculus
maximum rule is illustrated in Figure 4. It will be observed that the
condition dR/dQ = 0 is satisfied not only at the maximum point A. It also
holds at point B where profit is at its minimum (!) and at any point in the
level stretch CD. Except in the mathematical programming cases
discussed in the previous section, we may conclude that wherever profits are
maximized dR/dQ = 0, but the converse is not true: We may be at an
output such as OQ_B, where dR/dQ = 0, and yet profit will not be maximized
at that point.

The source of the difficulty is easily seen. The marginal maximization
condition dR/dQ = 0 assures us only that we are at a level stretch on the profit
hill: we are neither going uphill nor downhill. But being on a piece of
level ground is obviously no guarantee that we are on top of a hill.

To take care of this difficulty we need some more information. We re- 
quire another condition (called a second-order condition) which assures us 
that we have just stopped going uphill, and that if we go any further we 
will begin to descend. If this is true, and if we are at a level point, we must 
clearly be at the top of a hill. This elementary and apparently trivial argu- 
ment lies behind a considerable body of relatively deep analysis. 

Unlike the problems of the preceding section, our present difficulties
can normally be taken care of with the help of the differential calculus.
The second-order condition which has just been described is essentially a
requirement about the behavior of the slope of the curve. The (first-order)
condition dR/dQ = 0 states that the slope must be zero at a profit-maximizing
point. The second-order condition states that the slope must previously
have been positive, i.e., that there dR/dQ > 0 (we must have been
going uphill as we moved toward point A from the left) and that thereafter
the slope must become negative (further movement to the right, i.e.,
further increases in production after output level OQ_A, must reduce total
profit). In sum, the second-order condition requires that dR/dQ fall as
output increases. These two requirements are summarized graphically in
Figure 5. Here, rather than representing total profit on the vertical axis,
as we did before, we have instead measured the marginal profitability of
output, dR/dQ, along that axis. The first-order condition, then, requires
that the graph cut the horizontal axis (dR/dQ = 0) at the profit-maximizing
output OQ_A. The second-order condition requires that the slope of the
marginal curve at that point be negative, so that dR/dQ will be positive to
the left of Q_A and negative to its right.

This second requirement is a condition which refers to the behavior of 
the curve in Figure 5. It states that the slope of the dR/dQ curve must be 
negative. But dR/dQ is, in turn, itself a slope—the slope of the profit 
graph. Hence our second-order condition is a statement about the slope of 
a slope. It involves what is called a second derivative. 

The process used to find a second derivative (written d²y/dx² or y″)
is a simple repetition of that used to find the first derivative. We just
differentiate (to find a slope) and then differentiate again (to find the
slope of the slope curve). For example, given y = 4x³, we know that
dy/dx = 12x² and (by repeated differentiation) d²y/dx² = 24x.

In sum, the second-order rule states: To find the maximum value of
any relationship between two variables, y and x, compute dy/dx and
determine the values of x for which this has the value zero. If for any of
these values of x we also find that d²y/dx² is negative (dy/dx falling, as in
Figure 5), then this is a true maximum point.

Minimization, too, can occur in an optimality calculation. For example,
instead of maximizing profit we may wish to find the output level which
minimizes total cost. The preceding rule is easily modified to deal with
this sort of problem. The reader should convince himself that the only
required change in the second-order condition is that in minimization the
second derivative must be positive.⁷

Example: Find the maximizing value of x for

$$y = 100 + 12x - x^3.$$

Here dy/dx = 12 − 3x², so that if dy/dx = 0 we must have
12 − 3x² = 0, i.e., 3x² = 12, x² = 4, x = ±2.
7 Actually, slightly weaker conditions will do in both cases. If the second derivative
is zero but the fourth derivative d⁴y/dx⁴ is negative, the point in question is still a
maximum; if this derivative is positive, it is a minimum. A similar result holds for the
case where the fourth derivative is also zero but the sixth is not, and so on.

To find which of these two numbers (if either) yields a maximum value of y, we
find that d²y/dx² = −6x, which is negative for x = +2 and positive for x = −2.
Hence x = +2 yields a maximum value of y:

$$100 + (12)(2) - (2)^3 = 116$$

and x = −2 yields a minimum value of y:

$$100 + (12)(-2) - (-2)^3 = 84.$$
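
The second-order rule in this example can be written out mechanically. The sketch below takes the function to be y = 100 + 12x − x³ with first and second derivatives 12 − 3x² and −6x, and classifies each point at which the first derivative vanishes. The function names are introduced only for this illustration.

```python
def y(x):
    return 100 + 12 * x - x ** 3


def dy(x):
    """First derivative: 12 - 3x**2."""
    return 12 - 3 * x ** 2


def d2y(x):
    """Second derivative: -6x."""
    return -6 * x


def classify(x):
    """Apply the second-order rule at a point where dy/dx = 0."""
    if dy(x) != 0:
        return "not a stationary point"
    if d2y(x) < 0:
        return "local maximum"
    if d2y(x) > 0:
        return "local minimum"
    return "second-order test inconclusive"


for x in (2, -2):
    print(x, classify(x), y(x))
```

The classification is purely local: it says that x = +2 is the top of a hill, not that it is the highest point on the whole graph.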


One more important warning is still required. There is a pitfall in the 
definition of the words “maximum” and “minimum” as used in ordinary 
discussions involving the differential calculus. A maximum is used to 
denote the top of a hill in the graph of the function. But the graph may 
contain several hills (each is then called a local maximum) and the calculus 
procedure as described offers us no guarantee that we have found the
highest hill (called a global maximum). Moreover, the graph may contain
higher points which are not hilltops. This was in fact the case in the last
example whose graph is shown in Figure 6. It is to be noted that our (local)
maximum point A is below points such as B and our (local) minimum
point C is above points such as D. In this case no global maximum or
minimum points exist because y keeps going downhill indefinitely as we
move to the right beyond A and it rises indefinitely to the left of C. Hence
it is pointless to look for a global maximum or minimum here. But even
in a problem in which there are a number of local maxima, one of which is a
global optimum, the differential calculus methods just described will not
do the trick. However, mathematical programming has made some progress
in dealing with this problem.

Figure 6. Graph of y = 100 + 12x − x³
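The warning can be illustrated numerically for the cubic of the last example, y = 100 + 12x − x³. In the sketch below the values x = −5 and x = 5 are arbitrary choices standing in for points such as B and D; they are not taken from the figure.

```python
def y(x):
    return 100 + 12 * x - x ** 3

local_max = y(2)     # hilltop A, y = 116
local_min = y(-2)    # valley bottom C, y = 84

# Far enough to the left the curve rises above the local maximum,
# and far enough to the right it falls below the local minimum.
higher_than_hilltop = y(-5) > local_max
lower_than_valley = y(5) < local_min
print(higher_than_hilltop, lower_than_valley)
```

Both comparisons come out true, which is why the first- and second-order conditions alone cannot certify a global optimum.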


PROBLEMS 


1. Find the second derivatives of 
(a) y = 825+ 15x 
(b) 2 log 4z. 


2. (a) Does y = 50 + 90x − 5x² have a maximum or a minimum? What is the
value of x at that point?
(b) How about y = 15x²?


6. Maximization in Many-Variable Relationships: Partial Differentiation


Usually more than two variables will be involved in an economic rela- 
tionship. For example, total profit R will depend not only on the level of 
output Q. It will also depend on the firm's advertising expenditure A, on 
the price P charged for some competing product, and so on. If these four 
variables, R, Q, A, and P, were the only ones involved, we would write 


R = f(Q, A, P),


which is read, “R is a function of Q, A, and P.” This means only that the
level of total profit depends in some specified manner on the levels of the 
firm's output, its advertising expenditure, and the price of the competing 
product. 

Given such a multivariable relationship, we may again ask about the 
effect of a change in Q, A, and P on total profits. In doing so we investigate 
the marginal profit effects of a change of one or more of these variables. 
In particular, we may wish to see what happens when we vary the value 
of one of the variables and the values of the others do not change from 
given amounts. As is to be expected, there is a form of the derivative which 
corresponds to such an “other things being equal” marginal concept. It is 
called the partial derivative. For example, if we wish to examine the effect 
(at a point) of a change in advertising expenditure on total profit for any 
specified and unchanging levels of Q and P, we compute the partial
derivative of total profit with respect to advertising expenditure, which is
written ∂R/∂A.


The procedures for partial differentiation are a simple and intuitively 
comprehensible extension of the ordinary differentiation process. We just
treat as constants all variables other than the two which are directly
involved. This is the natural interpretation of the idea that the values of all
other variables are held constant. An example will make this clear. Given

y = 5x⁴ + 55x³ log z + 3zx − 12z² + 4

to find ∂y/∂x we treat z as a constant so that the relationship may be
rewritten as

y = 5x⁴ + (55 log z)x³ + (3z)x − (12z² − 4).

None of the terms in parentheses contains either a y or an x, so that each
such term is treated as a constant number, i.e., for purposes of partial
differentiation our relationship is treated like the equation

y = ax⁴ + bx³ + cx + d

where a, b, c, and d are constants. Thus we have

∂y/∂x = 4ax³ + 3bx² + c = 20x³ + 3(55 log z)x² + 3z.

The term d = −(12z² − 4) drops out in partial differentiation with respect
to x because the derivative of a constant is always zero.
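One way to gain confidence in a partial derivative is to compare it with a small finite difference taken in x alone while z is held fixed. The sketch below does this for the example above; it assumes that "log" denotes the natural logarithm, and the evaluation point (x = 1.5, z = 2) and step size h are arbitrary choices.

```python
import math

def y(x, z):
    # y = 5x^4 + 55 x^3 log z + 3zx - 12z^2 + 4 (natural log assumed)
    return 5 * x**4 + 55 * x**3 * math.log(z) + 3 * z * x - 12 * z**2 + 4

def dy_dx(x, z):
    # Partial derivative with z treated as a constant.
    return 20 * x**3 + 3 * (55 * math.log(z)) * x**2 + 3 * z

def numeric_partial_x(f, x, z, h=1e-6):
    # Central difference in x only; z is not moved.
    return (f(x + h, z) - f(x - h, z)) / (2 * h)

x0, z0 = 1.5, 2.0
analytic = dy_dx(x0, z0)
numeric = numeric_partial_x(y, x0, z0)
print(analytic, numeric)
```

The two printed numbers should agree to several decimal places, since only the terms containing x respond to the small change.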

Employing the partial differentiation concept, we can now extend our 
discussion of the calculus minimization and maximization criteria. We
shall discuss only the first-order conditions because the second-order
conditions are rather complex in this case and we defer their discussion to
Chapter 13.⁸


Figure 7 is a geometric representa- 
tion of a three-variable case. Here the 
graph of the profit relationship is a 
three-dimensional hill. Any point on 
the “floor” of the diagram represents a 
pair of values of Q and A. For example, 
point K represents a situation in which 
the firm decides on output level OQ, 
and on advertising expenditure OA,. 
Moreover, the profits to be earned by 
this output-advertising expenditure 
combination are represented by the 
length of the vertical line KK’ from 
point K to the point K’ on the profit 
surface directly above it. 


⁸ These conditions have in fact played an important role in comparative statics
analysis. The conditions are discussed and used in the Mathematical Appendix to
J. R. Hicks’ Value and Capital, 2nd Edition, Oxford University Press, New York, 1946,
and in Chapter IV and Appendix A of Paul A. Samuelson’s Foundations of Economic
Analysis, Harvard University Press, Cambridge, Mass., 1948. For a good introductory
exposition see James M. Henderson and Richard E. Quandt, Microeconomic Theory,
McGraw-Hill Book Company, New York, 2nd ed., 1971, especially Chapters 2 and 3.

Here the profit-maximizing output-advertising expenditure combination
is obviously M, and it will be noted that at its highest point, M′, the
profit surface is again level. This means that the slope of any cross section
is zero at that point (for otherwise we would be going uphill or downhill
and hence we would not be at a maximum). In particular, this means that
as we move directly toward the right (increase Q and hold A constant) we
must find the profit surface level, i.e., we must have ∂R/∂Q = 0. Similarly,
a small move directly toward the rear of the diagram (increasing A and
holding Q constant) must also encounter a level profit surface so that
∂R/∂A = 0.

More generally, in the n + 1 variable case, y = f(x₁, ···, xₙ), if we
are to be at a maximum point, we must have the n relationships

∂y/∂x₁ = 0, ∂y/∂x₂ = 0, ···, ∂y/∂xₙ = 0

so that no small change in value in any of the variables x₁, x₂, ···, xₙ will
increase y. These are the first-order maximum or minimum conditions in
the many-variable case.

The way this helps us to find the maximizing or minimizing values of
the variables x₁, x₂, ···, xₙ is now straightforward, at least in principle.
The conditions ∂y/∂x₁ = 0, etc., are n equations in the n unknowns; and
if they can be solved simultaneously for the values of the x’s, they will
yield the maximum or minimum values (if the appropriate second-order
conditions are satisfied).

Example: Assuming that the second-order conditions are satisfied, find the
profit-maximizing values of Q and A. Given the relationship

R = 400 − 3Q² − 4Q + 2QA − 5A² + 48A

we take ∂R/∂Q and ∂R/∂A and set them equal to zero to obtain

∂R/∂Q = −6Q − 4 + 2A = 0, i.e., −6Q + 2A = 4

∂R/∂A = 2Q − 10A + 48 = 0, i.e., 2Q − 10A = −48.

To eliminate the terms containing the A’s, multiply the first equation by 5 to
obtain

−30Q + 10A = 20

and add this to the second equation to yield

−28Q = −28 or Q = 1.

Substitution of this value into the first equation gives

−6 × 1 + 2A = 4 or 2A = 10, so that A = 5.

Hence Q = 1 and A = 5 are the output and advertising expenditure levels which
maximize total profit.
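Because the two first-order conditions in this example are linear, they can also be solved in one step. The sketch below uses Cramer's rule with exact fractions; the helper name solve_2x2 is illustrative, and the approach is a substitute for, not a description of, the elimination carried out above.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*Q + b1*A = c1 and a2*Q + b2*A = c2 by Cramer's rule."""
    det = Fraction(a1 * b2 - a2 * b1)
    if det == 0:
        raise ValueError("the two equations are not independent")
    q = Fraction(c1 * b2 - c2 * b1) / det
    a = Fraction(a1 * c2 - a2 * c1) / det
    return q, a

def profit(q, a):
    return 400 - 3 * q**2 - 4 * q + 2 * q * a - 5 * a**2 + 48 * a

# First-order conditions of the example:
#   -6Q +  2A =   4
#    2Q - 10A = -48
q_star, a_star = solve_2x2(-6, 2, 4, 2, -10, -48)
print(q_star, a_star, profit(q_star, a_star))
```

The solution is Q = 1, A = 5, and substituting these into the profit relationship gives R = 518.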


PROBLEMS 


Which values of the variables satisfy the first-order maximum or minimum 
conditions in the following relationships? 


1. R = 737 — 5Q? + 22A + QA — 44A? + 179. 
2. y = 83 4 + r? + 26r — 6zz — 30z + 22. 


7. Total Differentiation


Total differentiation is a natural extension of the idea of partial
differentiation. If we have an n-variable function, y = f(x₁, x₂, ···, xₙ),
then the partial derivative ∂y/∂x₁ is the effect of a small change in x₁ on
y all other things being held equal, i.e., it is the effect on y of a change in x₁,
(hypothetically) holding x₂, ···, xₙ constant, whether or not in reality
these values are affected by x₁. Thus, we may ask about the effect of a rise
in the price charged by a firm, x₁, on the demand for its own products, y,
on the assumption that competing firms were not to change their prices in
response. But in fact competitive price responses do sometimes occur, so
that if we want to determine the final net outcome of the chain of events set
off by a change in x₁, we cannot stop with the corresponding partial
derivative. We must instead use an expression, the total derivative, that takes into
account not only the direct influence of the change in x₁ on y but also its
indirect influence via its effects on the other variables, x₂, ···, xₙ.

The formula for the total differential of y is

dy = (∂y/∂x₁) dx₁ + (∂y/∂x₂) dx₂ + ··· + (∂y/∂xₙ) dxₙ

or dividing through by dx₁, we obtain the formula for the total derivative
of y with respect to x₁ (recalling that dx₁/dx₁ = 1),

dy/dx₁ = ∂y/∂x₁ + (∂y/∂x₂)(dx₂/dx₁) + ··· + (∂y/∂xₙ)(dxₙ/dx₁).

This fundamental result will be used repeatedly in the book. An intuitive
interpretation of these formulae is not difficult. The second of these
equations, for example, states that the effect of x₁ on y is composed of two parts:
its direct effect, ∂y/∂x₁, plus the sum of its indirect effects via all other
variables xᵢ, each such indirect effect being given by (∂y/∂xᵢ)(dxᵢ/dx₁), the
product of the effect of x₁ on xᵢ and of xᵢ on y. This equation almost strikes
one as too simple to be true. In effect it states that to find the total effect
of x₁ on y one takes its direct effect and each of its indirect effects via other
variables in turn and simply adds them up.

It is quite easy to indicate the derivation of these formulae. Dealing, 
for simplicity of notation, with the two-variable case y = f(x₁, x₂), we
obtain the formula for the total differential as follows:

By definition

Δy = f(x₁ + Δx₁, x₂ + Δx₂) − f(x₁, x₂)

or simultaneously subtracting and adding f(x₁, x₂ + Δx₂),

Δy = [f(x₁ + Δx₁, x₂ + Δx₂) − f(x₁, x₂ + Δx₂)] + [f(x₁, x₂ + Δx₂) − f(x₁, x₂)]

   = {[f(x₁ + Δx₁, x₂ + Δx₂) − f(x₁, x₂ + Δx₂)] / Δx₁} Δx₁

   + {[f(x₁, x₂ + Δx₂) − f(x₁, x₂)] / Δx₂} Δx₂.

But, by definition, the limit of the second fraction, as Δx₂ approaches
zero, is ∂f/∂x₂, and the analogous interpretation holds for the first fraction.
Consequently, taking limits as Δx₁ and Δx₂ simultaneously approach zero,
Δy approaches dy, etc., and our last equation approaches

dy = (∂f/∂x₁) dx₁ + (∂f/∂x₂) dx₂,


which is our desired result. 
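The total derivative formula can be tried out on the pricing illustration used earlier in this section. Everything numerical in the sketch below is hypothetical: the linear demand function, the rival's response rule, and the starting price are invented purely for illustration.

```python
def demand(p_own, p_rival):
    # Hypothetical demand y = f(x1, x2): falls with own price, rises with rival's.
    return 100 - 2 * p_own + 1 * p_rival

def rival_price(p_own):
    # Hypothetical response of competitors: x2 depends on x1.
    return 10 + 0.5 * p_own

# Partial derivatives read off the demand function, and the response slope.
dy_dx1 = -2.0     # direct effect, rival price held fixed
dy_dx2 = 1.0      # effect of the rival's price on demand
dx2_dx1 = 0.5     # how the rival's price moves with the firm's own price

# Total derivative: direct effect plus the indirect effect via x2.
total = dy_dx1 + dy_dx2 * dx2_dx1

# Numerical check: change x1 and let x2 follow it.
x1, h = 20.0, 1e-3
numeric = (demand(x1 + h, rival_price(x1 + h))
           - demand(x1 - h, rival_price(x1 - h))) / (2 * h)
print(total, round(numeric, 6))
```

Under these assumptions the partial derivative alone (−2) overstates the loss of sales; once the rival's price response is allowed for, the total effect is −1.5.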


PROBLEM 


1. Find the total differentials of the following: 
(a) y = 2x₁x₂
(b) y = 2x₁ + 4x₂
(c) y = f(x₁, x₂) + x₁


8. Constrained Maxima: Lagrange Multipliers


We have already come across a number of cases in which the range of 
variation of the variables was restricted. Maximization or minimization in 
such cases becomes a problem of finding the largest or smallest values 
which can be achieved within the permitted ranges of the variables. A 
problem of this sort is said to be one of constrained maximization, and the 
relationships which restrict the range of variation of the variables are 
called constraints or side conditions. Problems involving constraints occur 
frequently in economics, as we shall see in the next chapter. 


The relationship between a constrained and an unconstrained
maximization problem can be illustrated with the aid of a geographic analogy.
If we seek the location of the highest point on earth, we will end up with
the latitude and longitude of the peak of Mount Everest. But if this altitude
maximization problem is constrained by the condition that we must remain
within the continental limits of the United States, two changes will occur:
(1) we will end up with different latitude and longitude numbers, and
(2) the height of the maximum point will be decreased. Of course, if the
constraint had instead only required us to stay within the Asiatic continent,
rather than the United States, our original (unconstrained) answer would
have remained valid. Hence we conclude that a constraint usually will, but
need not always, change the values of the “independent variables”; it
usually will (but need not) decrease the value of the item being maximized,
and, at best, it will leave that value unaffected.

Let us now consider a specific maximization problem. A firm has 100
(thousand) dollars to spend on labor and raw materials in the next year.
Let L be the quantity of labor it hires and let its (annual) price per unit
be 2 (thousand dollars). Moreover, let the quantity of raw material
bought be M, and let it have a price of 1 (thousand dollars) per unit. Then
the firm is operating under the budget constraint that its total expenditure
on these two items be 100, i.e., that

2L + M = 100.

For subsequent reference it is important to note that this constraint is an
equation. Suppose, moreover, for purposes of illustrative simplicity that
the firm's output Q is related to L and M via the following improbable
production function:

Q = 5LM.

To get as much output as possible out of its budget the firm must find
the values of L and M which maximize Q but which satisfy the budget
constraint. Again, in this section we deal only with first-order conditions.

One fairly straightforward way of solving this constrained maximum
problem is to use the constraint to eliminate one of the variables. Thus
2L + M = 100 gives M = 100 − 2L, and substituting this expression for
M in the production function Q = 5LM gives

Q = 5L(100 − 2L) = 500L − 10L².

We are then left with an ordinary unconstrained maximization problem
involving only the two variables Q and L. This is solved, as before, by
setting the first derivative equal to zero to determine the value of L, i.e.,
by solving

500 − 20L = 0,

which gives L = 25. Now, substituting this into the budget constraint
equation, we find that M = 100 − 2L = 50. This, then, is the solution.
It will pay the firm to obtain twenty-five units of labor and fifty units of
raw material.
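The substitution method lends itself to a quick numerical confirmation. In the sketch below the function and parameter names are illustrative, and the brute-force scan over whole-number labor quantities is only a sanity check, not part of the calculus procedure.

```python
def output_along_budget(labor, budget=100, wage=2, material_price=1):
    """Q = 5LM with M eliminated through the budget constraint."""
    materials = (budget - wage * labor) / material_price
    return 5 * labor * materials

# Q = 500L - 10L^2, so dQ/dL = 500 - 20L = 0 gives L = 25.
best_labor = 500 / 20
best_materials = 100 - 2 * best_labor
best_output = output_along_budget(best_labor)

# Brute-force confirmation over whole-number labor levels that fit the budget.
grid_best = max(range(0, 51), key=output_along_budget)
print(best_labor, best_materials, best_output, grid_best)
```

The scan picks out L = 25 as well, with M = 50 and an output of 6,250 units.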

Before discussing a second and far more powerful method of solving 
such constrained maximum or minimum problems, let us examine the 
problem geometrically. In Figure 8a the straight line BB’ is the graph of
the budget constraint 100 = 2L + M. This graph shows how the budget 
constraint restricts the range of values of the variables L and M. It does 
not just set fixed upper or lower limits on their values. Rather, it states 
that only certain combinations of these values are admissible, i.e., those 
which satisfy the budget equation so that (like combination L, and M.) 
they are represented by points on this line. In economic terms, only
combinations of labor and raw material quantities whose total value is equal
to 100 are to be considered. All other points such as C or D represent input 
combinations which do not meet the firm's budget requirements. 

Now, suppose we place Figure 8a flat on the floor and erect over it a
graphic representation of the production function Q = 5LM. This is done
in Figure 8b in which surface OUVW represents the production function,
i.e., for each combination of inputs L and M it shows how much will be
produced.

Figure 8

In the constrained maximization problem we are not interested in the 
entire production surface. We consider only that part of the surface which 
corresponds to admissible input combinations whose locus is line BB’ on 
the floor of the diagram. The locus of the corresponding outputs is the arc 
BH'B' above line BB’. The optimum point which we seek is clearly point 
H, which lies below the highest point on this arc, i.e., it yields the highest 
attainable production level, HH’. 

The method of solution which already has been described involves our 
using the budget constraint to eliminate one of the variables. This is 
tantamount to our taking a cross section of the diagram which contains 
both line BB’ and arc BH'B'. In this way one of the dimensions is elimi- 
nated from the diagram and we are left with an ordinary maximization 
problem in two dimensions (variables). 

Unfortunately, the method does not always work. The constraint does 
not always take the simple form of our budget equation, so that it is not 
always possible to eliminate one of the variables directly. For example, if 
(in some nightmare) we encountered the constraint 


L” log LM Er 
VA—LUMS C 
we would find it difficult to solve for M in terms of L as we did before. 
Such cases must be dealt with by the method of Lagrange multipliers, which 
is, in any event, of far greater theoretical interest. Rigorous justification 
of this procedure is beyond the scope of this book but it will be described 
and explained intuitively. 
It will be recalled that in the unconstrained maximization problems 
we proceeded by differentiating partially with respect to each variable in 
turn and setting each of these partial derivatives equal to zero. This gave 
us as many equations as variables and normally these could be solved for 
the optimum values of the variables. This method usually breaks down 
when there are constraint equations in the problem; for, in addition to the 
"partial derivative equal to zero" conditions, the constraint equations 
must also be satisfied. This means that the problem contains more
equations than unknowns and is therefore, in normal circumstances,
overdetermined.⁹ To get out of this difficulty we introduce some artificial
unknowns, as many as there are constraints, to increase the number of
unknowns to equality with the number of partial derivative equations and
constraint equations. These artificial unknowns are called Lagrange
multipliers.

⁹ It is not true that equality of the number of equations and unknowns either
guarantees or is necessary for solvability of a system of simultaneous equations. However,
there is some presumption that this will be so. See below, Chapter 23, Section 1,

Let us rework our example to show how the method works, meanwhile 
offering some justification for the procedure. First we take our constraint 
2L + M = 100 and bring all the terms over to one side of the equation 
to obtain 

2L + M — 100 = 0. 


Next, we multiply the resulting expression on the left by an unknown
constant λ,¹⁰ and add the result to the production function Q = 5LM to
obtain the so-called Lagrangian expression

Q_λ = 5LM + λ(2L + M − 100).

¹⁰ λ is the Greek letter lambda, which is usually used for this purpose.

The basic point is, roughly, that if the constraint is always satisfied the
expression in parentheses will be equal to zero so that the Lagrangian
expression Q_λ will behave exactly in the same way as does the production
function. Whatever values of L and M maximize the one will automatically
maximize the other. But the Lagrangian expression contains three symbols
whose values are unknown, namely λ, L, and M. We may then solve the
problem by differentiating partially with respect to each of the three
unknowns, set the three results equal to zero, and solve these three equations
for the three unknown values. This yields

∂Q_λ/∂L = 5M + 2λ = 0

∂Q_λ/∂M = 5L + λ = 0

∂Q_λ/∂λ = 2L + M − 100 = 0 (which is the budget constraint equation).

Now the reader will note that when we set the partial derivative with
respect to λ equal to zero we automatically guarantee that our constraint will
be satisfied. This was arranged for when we multiplied λ by the constraint
expression in the Lagrangian function Q_λ. But, as we have seen, since the
constraint is satisfied (2L + M − 100 = 0), we must have Q_λ = Q and
the solution to the Lagrangian problem then necessarily solves the original
problem as well. That, roughly, is the rationale of the Lagrangian method.

Our three equations can be solved by multiplying the second equation
through by 2 and subtracting from it the first equation to obtain 10L = 5M
or M = 2L. Substituting this into the last equation we obtain 4L = 100
or L = 25, and another simple substitution yields M = 50, λ = −125,
the same result as before, except that, in addition, we have now obtained a
value for λ.
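For this particular production function the three first-order conditions happen to be linear in L, M, and λ, so they can be solved mechanically. The following is a minimal sketch using Gauss-Jordan elimination with exact fractions; the helper solve_linear is illustrative and makes no attempt to handle systems without a unique solution gracefully.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Gauss-Jordan elimination with exact fractions (small systems only).

    Assumes a unique solution exists; a singular system makes the
    pivot search below fail with StopIteration.
    """
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Unknowns ordered as (L, M, lam):
#   derivative w.r.t. L   : 0*L + 5*M + 2*lam = 0
#   derivative w.r.t. M   : 5*L + 0*M + 1*lam = 0
#   derivative w.r.t. lam : 2*L + 1*M + 0*lam = 100
labor, materials, lam = solve_linear(
    [[0, 5, 2], [5, 0, 1], [2, 1, 0]],
    [0, 0, 100],
)
print(labor, materials, lam)
```

The output reproduces L = 25, M = 50, and λ = −125. In general the first-order conditions of a Lagrangian are nonlinear, and a routine of this kind would not suffice.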

It can be shown that this value of λ itself has a significant economic
interpretation. In this case, −λ is the marginal productivity of money: it
indicates how much would be added to output if the input budget were
increased from 100 to 101. We can check this roughly by noting that this
extra unit of budget could be used to buy another unit of M, which would
increase output from

Q = 5LM = 5 × 25M = 125M to Q + ΔQ = 125(M + 1) = 125M + 125

so that

ΔQ (= ΔQ/1 = ΔQ/ΔM) = 125 = −λ.

More will be said about the interpretation of Lagrange multipliers in a 
later chapter. 
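The interpretation of the multiplier as the marginal productivity of money can be checked by solving the problem afresh with a budget of 101 and comparing the two optimal outputs. The sketch below relies on the substitution method to locate each optimum; the function name and parameters are illustrative.

```python
def best_output(budget, wage=2, material_price=1):
    """Maximum of Q = 5LM on the line wage*L + material_price*M = budget.

    Eliminating M gives a quadratic in L whose peak is at L = budget / (2 * wage).
    """
    labor = budget / (2 * wage)
    materials = (budget - wage * labor) / material_price
    return 5 * labor * materials

gain = best_output(101) - best_output(100)
print(best_output(100), best_output(101), gain)
```

The gain is 125.625, close to the value 125 given by the multiplier. The small discrepancy arises because the multiplier measures the rate of increase at the margin, whereas a full unit of extra budget is a finite change.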

Finally, we note that if the problem has more than one constraint 
equation, to obtain the Lagrangian expression we multiply each of the 
constraints by a different unknown Lagrange multiplier and add them all 
to the original expression whose value is to be maximized. 

Example: Maximize

y = 10xzw − 3w²

subject to x + z + w = 12 and x − w = 2.

We rewrite the constraints as x + z + w − 12 = 0 and x − w − 2 = 0, and
multiplying these, respectively, by λ₁ and λ₂ we can write the Lagrangian
expression

y_λ = 10xzw − 3w² + λ₁(x + z + w − 12) + λ₂(x − w − 2).

The maximum (or minimum) value is then found from the equations

∂y_λ/∂x = 10zw + λ₁ + λ₂ = 0

∂y_λ/∂z = 10xw + λ₁ = 0

∂y_λ/∂w = 10xz − 6w + λ₁ − λ₂ = 0

∂y_λ/∂λ₁ = x + z + w − 12 = 0

and ∂y_λ/∂λ₂ = x − w − 2 = 0,

where the last two lines are the constraint equations.


PROBLEMS


1. By both methods find the values which satisfy the first-order conditions for
maximization of


(a) y = 10xw − 2w² subject to x + w = 12
(b) R = 737 — 8Q2 -- 144 P QU 441247909 NBE to 


2Q+ A—2. 
2. Write out the Lagrangian expression for the problem: maximize
y = log z^w 


subject to 
coszcosw = 0.3 and pr +e" = 10. 


9. Some Economic Applications of the Differential Calculus 


In economics the differential calculus has had many fruitful applications.
In fact, as we have already noted, economists have invented a special
terminology for this technique, referring to it as marginal analysis.
This application arises naturally in an investigation of the decision-making
of business firms, consumers, and other economic units. For, in pursuing
their goals, these units may be taken to maximize some measure of
achievement, whether it be profits, national income, or some other such variable.

It is convenient at this point to list some of the functional relationships
which recur most frequently in the work of the economist:

(1) a production function, Q = f(x), which records how the required
quantity of labor or some raw material, x, varies with the production level
Q of some commodity; e.g., Q may represent the number of shoes produced
per week by a shoe factory;

(2) a cost function, giving C as a function of Q, which records the total
expense C associated with production level Q;

(3) a demand function, P = F(Q), which shows how high a price P
can be charged per unit if it is desired to sell Q units of a commodity, i.e.,
how much of the commodity consumers will demand at each price;

(4) a total revenue function, PQ, which shows the total income from the
sale of Q units of the commodity at the price P;

(5) a utility function, giving U as a function of Q, which records the
satisfaction that an individual derives from the possession of some quantity Q
of a commodity.

Economists have then adopted the following terminology:

marginal utility is the name given to dU/dQ

marginal product refers to dQ/dx

marginal cost refers to dC/dQ

marginal revenue refers to d(PQ)/dQ.

Suppose now that a businessman desires to earn as much total profit,
R, as he can. This means that he desires to make as large as possible the
difference between his total receipts (revenue) PQ and his total costs C.
In other words, he seeks to maximize

R = PQ − C.


We can now see what level of production Q is most profitable. In ordinary 
circumstances, R will be maximized when its derivative vanishes, i.e.,
when

dR/dQ = d(PQ)/dQ − dC/dQ = 0

or

d(PQ)/dQ = dC/dQ.


In other words, maximum profits require that marginal cost be equal to 
marginal revenue. This is a fundamental result in the economic theory of 
the firm, which will be discussed again later. 


Example 1: Optimal production level. Suppose the relevant portion of the
demand function is

P = 100 − 0.01Q,

where Q is the weekly production. This equation states that as more of the
commodity is put on the market its price must fall. If price is measured in cents, it
means that price must fall by 1 cent for every 100 additional units of the
commodity which appear on the market each week. Suppose also that the cost function
is given by

C = 50Q + 30,000.

It is easy to verify that

total revenue = PQ = 100Q − 0.01Q²

marginal revenue = d(PQ)/dQ = 100 − 0.02Q

marginal cost = dC/dQ = 50

so that maximum profit involves

marginal cost = marginal revenue, i.e., 50 = 100 − 0.02Q.

This means that the most profitable level of production will be

Q = 50/0.02 = 2,500 units per week.

At that level of production, price will be

P = 100 − 0.01Q = 100 − 25 = 75 cents

and total profit per week will be

R = PQ − C = 75 × 2,500 − (30,000 + 50 × 2,500) cents = $325.
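The steps of Example 1 can be collected into a short routine for any straight-line demand curve and constant marginal cost. This is a sketch only: the function name and the parameter names a, b, mc, and fixed are illustrative, standing for the demand intercept, the demand slope, marginal cost, and fixed cost.

```python
def optimum(a, b, mc, fixed):
    """Profit-maximizing point for demand P = a - b*Q and cost C = mc*Q + fixed.

    Marginal revenue is a - 2*b*Q; setting it equal to mc gives Q.
    """
    q = (a - mc) / (2 * b)
    p = a - b * q
    profit = p * q - (mc * q + fixed)
    return q, p, profit

# Figures of Example 1, with money measured in cents.
q, p, profit_cents = optimum(a=100, b=0.01, mc=50, fixed=30_000)
print(round(q), round(p, 2), round(profit_cents / 100, 2))
```

With the figures of the example this gives an output of 2,500 units per week, a price of 75 cents, and a weekly profit of $325.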
Example 2: The incidence of a sales tax. Suppose, in the preceding example,
the government decides to levy a tax of 10 cents per unit of product sold. What
will happen to price, quantity sold, and total profit?

Total cost now becomes

C = 50Q + 10Q + 30,000 = 60Q + 30,000.

Maximum profit again requires marginal cost to equal marginal revenue, which
now involves

60 = 100 − 0.02Q.

This yields

Q = 2,000 units per week
P = 80 cents
R = $100

We thus have the results

                      Before Tax    After Tax

Tax/unit (cents)           0            10
Weekly output          2,500         2,000
Price (cents)             75            80
Weekly profit ($)        325           100

Particularly noteworthy is the result that it does not pay the businessman to pass
on the full 10 cents tax rise to the consumer; rather, it is most profitable in this
case to raise his price by only 5 cents!
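The comparison before and after the tax can be reproduced by treating the tax as an addition to marginal cost. The sketch below assumes the same straight-line demand and constant marginal cost as the example; the function and parameter names are illustrative.

```python
def optimum_with_tax(a, b, mc, fixed, tax=0.0):
    """Demand P = a - b*Q, cost C = (mc + tax)*Q + fixed; returns (Q, P, profit)."""
    unit_cost = mc + tax
    q = (a - unit_cost) / (2 * b)      # from MR = a - 2bQ = unit_cost
    p = a - b * q
    return q, p, p * q - (unit_cost * q + fixed)

# Figures of the example, in cents.
before = optimum_with_tax(100, 0.01, 50, 30_000)
after = optimum_with_tax(100, 0.01, 50, 30_000, tax=10)

price_rise = after[1] - before[1]
print(round(after[0]), round(after[1], 2), round(price_rise, 6))
```

The price rises by 5 cents for a 10-cent tax. With this form of demand and cost the optimal price works out to (a + mc + tax)/2, so exactly half of any per-unit tax is passed on, whatever the particular numbers.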


Example 3: Fixed costs. Let the firm's total cost be given by C = k + c(Q).
That portion of the firm's costs, k, which does not change when its output Q changes
is called its fixed cost. The firm's marginal cost is obtained, by differentiation, to be

dC/dQ = dc(Q)/dQ.

It will be observed that the constant term, k, has dropped out in the course of this
differentiation. As a result, marginal costs are the same no matter what the value
of k, i.e., no matter what changes occur in the firm's fixed costs (cf. Section 6 of the
previous chapter).

This leads us to a rather surprising result. Since a change in fixed costs does
not affect the firm's marginal cost (nor, clearly, its marginal revenue), the
price-output combination at which marginal cost equals marginal revenue will be
unaffected by any such change. In other words if the firm maximizes its profits, a change
in fixed cost will affect neither its output level nor the price of its product!¹¹


PROBLEMS 

1. Give verbal definitions of marginal utility, marginal cost, marginal revenue, and
marginal product.

2. A commodity is sold at a fixed price P per unit. A consumer, in buying Q units
of the commodity, tries to maximize the difference between the utility U = f(Q),
which he derives from it, and the amount PQ which he has to pay. Show that to
maximize U − PQ = f(Q) − PQ he should
buy so much of the commodity that its marginal utility is equal to its price.

3. Elasticity of demand is a measure of the responsiveness of quantity demanded
to price changes, which is given by

(% change in quantity demanded)/(% change in price) = (100 dQ/Q)/(100 dP/P) = (P/Q)(dQ/dP).

If the demand function Q = F(P) is such that when price is cut purchases
increase by an amount just sufficient to keep total revenue unchanged, the
demand equation is PQ = K (a constant). Show that in this case elasticity of
demand = −1.

4. Let c = g(Q) be the cost per unit of producing a commodity, C the total cost
of producing that commodity, and Q the number of units produced. Then, by
definition, C = cQ. Prove that if Q is at such a level that costs per unit are at a
minimum (is this an efficient level of production?), we will also have marginal
cost equal to c.

5. Suppose total cost C = 120Q − Q² + 0.02Q³ and P = 114 − 0.25Q.
(a) What level of Q yields minimum costs per unit c?
(b) Does this level of output yield maximum profit?
(c) At how many levels of output is marginal cost equal to marginal revenue?
(d) Are these all profit-maximizing outputs? (Evaluate the second derivative
of the total profit.)

6. Consider the demand and cost functions

P = a − bQ
C = w + vQ,

where a, b, w, and v are positive constants. Suppose the government imposes
on the producer a tax of t dollars per unit of output and that as a result it pays
to raise price from P dollars to P* dollars. Show that P* − P = ½t, i.e., that it
pays the manufacturer to shift only one-half of the tax onto the consumer if
he is to maximize his profits.

7. Find the sign of the second derivatives of the profit functions in the preceding
problem. Why are they relevant?

¹¹ The rationale of the rule that the profit maximizer’s prices should not be changed
when fixed costs change is discussed in Chapter 15, below.


Linear Programming


5


Programming, both linear and nonlinear, is entirely a mathematical 
technique. Its economic content is therefore nil. This is no mere classi- 
ficatory quibble. It means that programming per se can never tell us any- 
thing about the operation of any part of the economy. Like the calculus 
or any other branch of mathematics, it can only help us to find the im- 


in areas outside their own fields, as they have been in the past when they 
formulated the largely technological law of diminishing returns, or when, 
by inventing the marginal analysis, they stumbled, a few centuries too 
late, on what is essentially a crude version of the differential calculus. 


¹ Several economists have made important contributions. Notable among these are
T. C. Koopmans, R. Dorfman, and W. W. Cooper. But if any one person is to be named
as the father of programming, we must undoubtedly award the honor to mathematician
George Dantzig, inventor of the first successful (and still one of the most efficient)
general computational techniques, the simplex method. Important contributions have been
made by mathematicians such as the Russian L. V. Kantorovich, who first formulated
the problem, H. W. Kuhn, A. W. Tucker, A. Charnes, and others.


1. Some Standard Programming Problems


Programming is concerned with the determination of the optimal
solutions to problems. As a result, it is well suited to the analysis of rational
behavior. It has, therefore, like the marginal analysis, been somewhat less
successful in describing what is than in indicating what (given some
preassigned goals) ought to be. Some of the most fertile applications of
programming have involved welfare economics and advice to businessmen,
both of which aim to tell the relevant persons how they can most efficiently
go about working toward their objectives. Let us indicate briefly a few of
the business problems to which programming is most frequently applied.


(1) Optimal product lines and production processes. When operating at a 
high output level a firm is likely to run into a variety of capacity limita- 
tions. Its factory size, the amount of time available on different machines, 
its warehouse space, and its skilled personnel—any or all may constitute 
bottlenecks, some of which are prohibitively expensive or even impossible 
to eliminate in the short run. 

A crucial characteristic of such a situation is that the production of a 
relatively unprofitable item or the use of a production process which makes 
liberal use of the scarce facilities may take up valuable capacity that can 
better be used in more economical processes and in the manufacture of 
more lucrative commodities. 

There is no simple solution, such as complete specialization in the one 
“most efficient process” for producing the one item which makes “most 
profitable” use of scarce facilities since, except by pure accident, there may 
be no process or no product which is economical in its use of all of the 
firm’s limited facilities at once. One item may make good use of machine 
capacity and may therefore yield the highest profit per scarce machine- 
hour, whereas another may make more effective use of limited warehouse 
space. Production of only the former would find warehouses completely 
loaded before machine time was fully employed, while the latter product, 
since it is not bulky, might leave warehouses half empty even if the firm’s 
machines were to turn out nothing else. 

(2) Transportation routing. In the selection of transportation routes, 
especially where a firm has many plants and its processes involve trans- 
shipment of items in various stages of production, substantial savings can 
be expected from careful planning of commodity movements. If the firm 
employs its own trucks or other transport facilities, the problem is to route 
them in a way that incurs as little cost as possible. Where the firm employs 
others to do its transporting, the computations may be further complicated 
by peculiarities in the transportation rate structure, for then the firm’s 
objective is not to minimize ton-miles but to minimize payments to the 
carrier, and the two do not always correspond. 

(3) Meeting product specifications. Many contracts include a number of 
specifications which must be met by the product, and sometimes the 
manufacturer will set up such standards for himself. Usually there is a 
variety of ways in which these specifications can be met. For example, an 
animal feed may require X units of protein per bag, Y of carbohydrates, 
Z of vitamin B, etc. Each of the grains combined in the animal feed con- 
tains some of the nutrients, and it is therefore possible to make a bag of 
feed meet these specifications in many different ways. A very inexpensive 
ingredient may contain much starch and very little else, so to meet the 
standards it may be necessary to add some more expensive ingredients. 
But which ingredients should be added and in what proportion? Or will 
it prove cheaper to begin with somewhat more costly ingredients which 
supply a better balance of all the nutrients? 

The least-cost combination of meeting specifications is basically a 
programming problem. This technique has, for example, been employed 
in just this fashion, i.e., in mixing animal feeds, as well as other areas such 
as in the blending of gasolines. Programming techniques have been em- 
ployed in many other business problems. They can help determine optimal 
inventory levels and have been used to solve production problems such as 
in the cutting of paper and cloth in a way which minimizes raw material 
waste and in the job assignment of specialized personnel. 


2. Characteristics of Programming 

What is the common element in all of these situations which makes 
them amenable to programming analysis? It is clear that all of them re- 
quire a search for “best” values of the variables. But there is something 
more involved which makes the usual tools of the calculus or marginal 
analysis inapplicable. In many problems of optimization there is a com- 
plication in that the outcome, to be acceptable, must meet certain condi- 
tions. For example, the problem of fencing in 20 square feet at minimum 
cost involves the determination of that shape of plot which will save on 
fencing most effectively. But any saving which is achieved by fencing only 
19 or 21 square feet will be unacceptable. This, then, is essentially a problem 
of finding the best way of meeting a very precise specification which the 
mathematician calls “a side condition.” So long as the specifications are 
so precise (the area must be 20 square feet, no more or less, or the starch 
content of a 100-pound bag of feed must be exactly so many calories, etc.), 
the optimization problem can usually still be dealt with by calculus (mar- 
ginal) techniques, as was shown in Section 8 of the previous chapter.2 

However, it is characteristic of many business problems that specifica- 
tions are not precise but provide only minimum requirements that must 
be met. Or the specification, rather than stating the precise extent to 


2 Even here there is an important exception. The mathematical form of the precise 
specification (side condition) is an equation. If the graph of the equation is discontin- 
uous or kinked, calculus methods cannot be depended on to work. The reason is that 
these techniques find an optimum by computing the slope of the relevant graphs to in- 
vestigate whether it is possible to go “uphill” (toward higher profits). Where the graph 
of a function is discontinuous or kinked, its slope is, for obvious reasons, not even defined. 
which a facility will be used, may indicate only the maximum capacity 
which is available. Any output which overshoots the quality standards or 
does not fully utilize some part of capacity is not necessarily ruled out. 
Here the side conditions are inequalities rather than equations. That is, 
they do not state that X must equal 500 but only that X must be no less 
than 500. 

This sort of side condition characterizes each of the business problems 
which has been described. In the optimum-product-line and production- 
process problem there are maximum capacities to be dealt with. In meeting 
specifications at minimum cost, each specification is such an inequality. 
In the transportation-routing and plant-location problems, the presence 
of such restrictions on the businessman’s decisions is less obvious, but they 
are nevertheless there and play a fundamental role in the computation. 
There can be limitations on the size and cargo-carrying capacity of the 
trucks, trains, or ships to be routed. But the more relevant capacity limita- 
tion is a peculiar one which states that in no case is it possible to ship 
negative amounts from one place to another! This rather silly-sounding 
restriction is important partly because things like this are never obvious 
to an electronic computer and, unless it is specifically forbidden to do so, 
the computer will assign negative shipments from some supply sources to 
some destinations. For the machine will reason that if it is profitable to 
reduce some shipments to zero, it may be still more profitable to reduce 
these shipments even further! 

For the economic theorist, such nonnegativity requirements are im- 
portant for a far more fundamental reason. Like an electronic computer, 
marginal analysis is, by itself, incapable of taking account of them. To 
return to the more familiar optimal-output problem, for the competitive 
firm we note that the rule of the marginal analysis is that the output of 
any item should be at a level at which marginal cost is equal to price. But 
for an unprofitable item marginal cost may only be equal to price at an 
impossible negative output level. That is, in a case of increasing costs, 
even at zero output cost need not have fallen back to the level of price. 


3 The reason marginal techniques break down in the presence of inequality side 
conditions is easily illustrated with the aid of a simple graph (Figure 3b of Chapter 4). 
Marginal analysis finds, e.g., the point of maximum profits by locating the point at 
which marginal profit (the slope of the total profit curve) equals zero (output OQ, in 
the figure). That is the meaning of the standard marginal-cost-equals-marginal-revenue 
condition. But suppose output is limited by the inequality that production cannot 
exceed OQ;. Then our problem is to find out whether the point of maximum attainable 
profit is OQ; or some point to its left (which is no easy problem in the N-dimensional 
N-variable case). But for this the first-order conditions of the marginal analysis cannot 
be employed, for at the optimum point in the diagram, OQ;, the marginal criterion, 
marginal profit equals the slope of the total profit curve equals zero, is obviously invalid. 
See Chapter 4, Section 4, for a more complete discussion of the cases in which the calculus 
maximization criteria are not directly applicable. 
Of course, no moderately sane economist making a graphic analysis will 
ever recommend a negative output. But where a large number of inter- 
dependent decisions have to be made, the calculations may all have to be 
done with the help of mathematical reasoning. And a mathematical analysis, 
based on marginal equalities such as marginal cost equals price, must in 
such a case yield nonsense results unless we impose on the calculation the 
explicit requirement that the variables be given no negative values. We 
will come to this point again later. 
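
The difficulty can be made concrete with a small numerical sketch. It assumes, purely for illustration, a linear marginal cost schedule MC = c0 + c1 Q; the figures and the function names are hypothetical and do not come from the text. The unconstrained marginal rule (marginal cost equals price) is compared with the same rule after the nonnegativity requirement has been imposed.

```python
def marginal_rule_output(price, mc_intercept, mc_slope):
    """Output at which marginal cost equals price, for MC(Q) = mc_intercept + mc_slope * Q.

    No nonnegativity requirement is imposed, so the result may be negative.
    """
    return (price - mc_intercept) / mc_slope


def constrained_output(price, mc_intercept, mc_slope):
    """Same rule, but with the explicit requirement that output be nonnegative."""
    return max(0.0, marginal_rule_output(price, mc_intercept, mc_slope))


# Hypothetical unprofitable item: even at zero output, marginal cost (10) exceeds price (4).
unprofitable_raw = marginal_rule_output(price=4.0, mc_intercept=10.0, mc_slope=2.0)
unprofitable_fixed = constrained_output(price=4.0, mc_intercept=10.0, mc_slope=2.0)

# Hypothetical profitable item: the marginal rule gives a sensible positive output.
profitable_raw = marginal_rule_output(price=16.0, mc_intercept=10.0, mc_slope=2.0)

print(unprofitable_raw, unprofitable_fixed, profitable_raw)
```

In this single-product case the correction is trivial. With many interdependent outputs no such simple clamp is available, which is where programming methods come in.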


Programming, then, is the mathematical method for the analysis and 
computation of optimal decisions which do not violate the limitations 
imposed by inequality side conditions. In almost all cases the method of 
computation is a so-called iterative procedure. Just as the term “ragout” 
disguises the fact that it is only stew, though presumably an elegant one, 
this fancy term is used to dignify a systematic trial-and-error procedure. 
The answer to a programming problem will ordinarily not be arrived at 
directly. Instead the solution is found by groping toward it. But the trial- 
and-error procedure is not pure guesswork. It is systematic in that it 
usually involves at least the first two of the following features: 


1. There is a mechanical rule which determines, after each step, exactly 
what the next step is to be on the basis of the results of the trial just com- 
pleted. One purpose of this feature of the method of solution is that it 
makes electronic computation possible. Most of these mechanical brains 
unfortunately possess no judgment of their own so they must be told 
what to do in every contingency. This is like teaching a human the rules 
of algebra before giving him an algebraic problem to solve. In any event, a 
mechanical rule stating what must be done at each succeeding trial in the 
trial-and-error procedure is useful because in a problem complicated by a 
great number of variables and interrelationships, human judgment can go 
badly wrong and can result in an inefficient, even totally ineffective, search 
for the answer. 


2. A second characteristic feature of the systematic trial-and-error 
procedure is a proof that the method has been constructed in a way which 
guarantees that each trial will yield values which are closer than the pre- 
ceding one to the correct answer. This very important feature assures the 
calculator that he is always getting closer to his result and is not wasting 
his time by going off in a wrong direction. We shall see later, in our dis- 
cussion of the simplex method, how this sort of guarantee can be built 
into a computation procedure. Of course, such a guarantee can only be 
provided where there is a mechanical rule which specifies step by step what 
will be done. Otherwise, successive steps are unpredictable, and it is then 
not possible to say in advance whether they will be closer to or farther 
from the correct answer. 

3. For a large class of problems there are available trial-and-error 
procedure rules which are guaranteed to yield precisely the correct result 
after a finite number of steps. In other cases where this is not possible, 
one can hope to calculate a maximum error, and to be able to say, for 
example, that the result of the most recent trial is at most one-tenth of 1 
per cent from the correct answer. 

Where the problem is one involving linear programming, there are 
several computational methods many of which yield a precise answer after a 
finite number of steps. The simplex method, the method of fictitious play, 
and the complete-description method are all linear programming computa- 
tional techniques. In the next section we will see what is involved in the 
linearity of a programming problem. 


3. Algebra and Geometry 

First, let us set out the equations of a typical linear programming 
problem. 

Consider a profit-maximizing firm which can produce any of the four 
products w, x, y, and z whose outputs are W, X, Y, and Z and whose 
profits per unit of output are, respectively, 5, 3, 2, and 7. Then the total 
profits of the firm are given by 5W + 3X + 2Y + 7Z. (Here, e.g., W and 
X may represent outputs of the same product manufactured by different 
“processes,” i.e., with the use of different input proportions.) Suppose, 
moreover, the firm has available only 50,000 square feet of warehouse 
space and 32,000 machine-hours. If the manufacture of one unit of w 
requires 0.5 hours of machine time, that of x requires 2 hours of machine 
time, etc., we have an inequality relationship (which is called a “constraint” 
or "side condition") such as 


        0.5W + 2X + 1.9Y + 3.1Z ≤ 32,000, 
which states that no more of these outputs can be manufactured than the 
available machine time permits. Assume also that there is a similar ware- 
house-space constraint which is written out below. We can see now that 
the programming problem can be written: 


maximize profits: 5W + 3X + 2Y + 7Z 
subject to the constraints 
        0.5W + 2X + 1.9Y + 3.1Z ≤ 32,000   (available machine time) 
(1)     10W + 1.2X + 7Y + 4Z ≤ 50,000      (warehouse capacity) 
and the nonnegativity requirements 
        W ≥ 0, X ≥ 0, Y ≥ 0, Z ≥ 0. 


This is the standard form for a programming problem. It consists of three 
parts: (1) the function (e.g., profits or costs) whose value is to be maxi- 
mized or minimized, which is called the objective function, (2) the ordinary 
structural (capacity) constraints, and (3) the nonnegativity conditions on 
the variables, e.g., W ≥ 0. 
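
A minimal sketch of how the program above might be solved by brute force. This is not the simplex method. It simply examines every basic solution (at most two of the six variables, slacks included, allowed to be nonzero) and keeps the most profitable feasible one, relying on the standard result that a linear program with a finite optimum attains it at such a solution. The slack names S_machine and S_warehouse are assumptions introduced here for illustration.

```python
from fractions import Fraction as F
from itertools import combinations

# Structural variables followed by one slack per constraint (slack names are hypothetical).
names = ["W", "X", "Y", "Z", "S_machine", "S_warehouse"]
profit = [F(5), F(3), F(2), F(7), F(0), F(0)]

# Constraint rows written as equations after adding the slack variables.
rows = [
    [F("0.5"), F(2), F("1.9"), F("3.1"), F(1), F(0)],  # machine time
    [F(10), F("1.2"), F(7), F(4), F(0), F(1)],         # warehouse space
]
rhs = [F(32000), F(50000)]

best_value, best_plan = None, None
for i, j in combinations(range(len(names)), 2):
    # Solve the 2x2 system for variables i and j by Cramer's rule; all others are zero.
    a, b = rows[0][i], rows[0][j]
    c, d = rows[1][i], rows[1][j]
    det = a * d - b * c
    if det == 0:
        continue  # the two columns do not determine a basic solution
    xi = (rhs[0] * d - b * rhs[1]) / det
    xj = (a * rhs[1] - c * rhs[0]) / det
    if xi < 0 or xj < 0:
        continue  # violates a nonnegativity requirement
    value = profit[i] * xi + profit[j] * xj
    if best_value is None or value > best_value:
        best_value = value
        best_plan = {names[i]: xi, names[j]: xj}

print({k: float(v) for k, v in best_plan.items()}, float(best_value))
```

Exact fractions are used so that the feasibility tests are not disturbed by rounding. Enumeration of this kind becomes impractical as the number of variables grows, which is the motivation for a systematic procedure.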

This problem is called "linear" because the expression to be maximized 
and the inequalities involve only the variable multiplied by constants and 
added together (as in an equation for a straight line such as y = 5x + 3). 
There are no X²'s, 5 sin Y's, log Z's, or more complex expressions. We may 
note that this linear model employs an assumption of competitively or 
otherwise fixed input and output prices and constant returns to scale in 
production.4 These premises enter in two ways. First, from the information 
that profit per unit of w is 5, we can only compute the profit from pro- 
ducing 10 units of w at 5W = 50 because the assumption of constant 
returns to scale and constant input and output prices implies that costs, 
revenues, and profits will all rise precisely in proportion with the level of 
output. (We observe again that w is defined in such a way that it must 
always be produced with the same input proportions—in this analysis a 
change in the amount of scarce machine time used per unit of output is 
described as a shift from the manufacture of, say, w to that of x.) The 
linearity of the inequalities also rests on the assumption of constant returns 
to scale—the amount of warehouse space occupied by product y is assumed 
strictly proportionate to Y, the level of output of that item.5 

Let us now look at some programming geometry. First, we must see 
how an inequality is represented graphically. Consider the inequality 
2X + Y ≤ 5. (This may be interpreted, e.g., as a warehouse capacity 
limitation.) In Figure 1 any point such as P on the line LL’ which 
represents the equation 2X + Y = 5 (and represents full use of 
capacity) clearly satisfies the inequality. But, in addition, any point 
such as Q, R, or S which lies below and to the left of LL’ also satisfies 
the inequality because it involves values of X and Y smaller than those 
which completely use up the capacity. We note, then, that while a 
two-variable equation is represented by a line, a two-variable inequality 
is represented by a region. Indeed, a linear two-variable inequality is 
represented by drawing a straight line which divides the plane into two 
regions called "half-spaces," one of which contains all points satisfying 
the inequality. 

[Figure 1] 


4 For a definition and discussion of the “constant returns" case see Chapter 11. 

5 Where the facts of the situation do not warrant these assumptions even as an 
approximation, we may be forced to employ techniques of nonlinear programming. 
Usually these are, at best, more complicated, as indicated in Chapter 7. 
Let us see what happens if, in addition, the variables must satisfy a 
second inequality, say Y + X ≤ 3 represented by the half-space to the 
left of line MM’. All of the points which satisfy both inequalities must 
lie below both lines MM’ and LL’, so that we are left with the region 
bounded by the broken line MKL’.6 Further inequalities may bound the 
region from all sides; e.g., the addition of the inequalities X ≥ 0, Y ≥ 0 
leaves us with the shaded region in the diagram. This area is called the 
feasible region because every point such as Q which lies within it or on its 
boundary represents a combination of values of the variables (the output 
levels of X and Y) which does not violate the constraints, i.e., any such 
output combination is within the firm’s capacity. For this reason, every 
point such as Q or P, in the feasible region, represents what is called a 
feasible solution to the programming problem. 
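
The feasibility test described above is easy to state in code. The sketch below checks a point against the two capacity inequalities and the nonnegativity conditions, and computes the corner K where the two constraint lines cross. The sample points are hypothetical and are not the P, Q, R, and S of Figure 1, whose coordinates the text does not give.

```python
from fractions import Fraction


def is_feasible(x, y):
    """True if (x, y) satisfies 2X + Y <= 5, X + Y <= 3, X >= 0 and Y >= 0."""
    return 2 * x + y <= 5 and x + y <= 3 and x >= 0 and y >= 0


# Corner K: the intersection of 2X + Y = 5 and X + Y = 3.
# Subtracting the second equation from the first gives X = 2, and hence Y = 1.
k_x = Fraction(5 - 3, 2 - 1)
k_y = 3 - k_x

# Hypothetical sample points, chosen only to illustrate the test.
samples = {
    "interior": (1, 1),
    "corner K": (k_x, k_y),
    "violates 2X + Y <= 5": (3, 0),
    "violates X + Y <= 3": (0, 4),
    "negative output": (-1, 0),
}
for label, (x, y) in samples.items():
    print(label, is_feasible(x, y))
```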

We can now represent the entire programming problem diagrammati- 
cally by adding to Figure 1 a third dimension which we use to represent 
profits (Figure 2a). A surface ORR'R" shows the profit that can be 
earned by any combination of our two outputs x and y. The XOY plane 
which constitutes the floor of the diagram is the same as the graph of 
Figure 1 or Figure 2b. Thus point K, for example, is a combination of out- 
puts X and Y. If this combination, say, is capable of yielding $100 in 
profits, we erect above point K the vertical line KK’ whose length is 100 
units. The profit surface ORR'R" is the locus of all points such as K’ 
whose height indicates the profits which are yielded by the output com- 
bination represented by the point K directly below. 

In the linear programming case this profit surface ORR'R" is always a 
plane through the origin (equation: profit = aX + bY, where a and b are 
constants). 

Alternatively (Figure 2b) this situation can be depicted in a two- 
dimensional diagram where the profits are shown by iso-profit curves. Any 
two points on such a curve represent combinations of outputs X and Y 
which yield the same profits—e.g., profits at output combination E are 
the same as at output combination K. In a linear program these iso-profit 
curves are always straight lines, and they are parallel. For, a typical profit 
equation in a two-variable linear program is 


        profit = 3X + 2Y. 


The curve representing a $50,000 profit level therefore has the equation 

        50,000 = 3X + 2Y,  i.e.,  Y = −(3/2)X + 25,000 


6 Note that while two linear equations in two variables will normally leave us with 
only one possible point (the intersection of the two straight lines), two or more linear 
inequalities will often still leave us an unlimited number of points to choose from. Thus, 
there is nothing necessarily wrong with, say, a system of 5 inequalities in 3 unknowns 
even though we usually prefer to have no more equations than unknowns. 
so that the profit indifference curve is a straight line whose slope is −3/2. 
Similarly, the line representing a $70,000 profit level will be higher than 
and of the same slope as the $50,000 line. The indifference curves of a 
linear program will therefore be a series of parallel straight lines. Further, 
moving to higher and higher curves will always increase profits.7 
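
The parallelism of the iso-profit lines can be checked numerically. The sketch below uses the profit equation of the text, profit = 3X + 2Y; the helper names are illustrative only.

```python
from fractions import Fraction


def profit(x, y):
    """Profit equation of the two-variable example: 3X + 2Y."""
    return 3 * x + 2 * y


def iso_profit_y(level, x):
    """Value of Y on the iso-profit line for the given profit level at output X."""
    return Fraction(level - 3 * x, 2)


# Slope = change in Y for a one-unit increase in X along each line.
slope_50 = iso_profit_y(50_000, 1) - iso_profit_y(50_000, 0)
slope_70 = iso_profit_y(70_000, 1) - iso_profit_y(70_000, 0)

# Vertical intercepts show that the $70,000 line lies above the $50,000 line.
intercept_50 = iso_profit_y(50_000, 0)
intercept_70 = iso_profit_y(70_000, 0)

print(slope_50, slope_70, intercept_50, intercept_70)
```

Both slopes depend only on the coefficients 3 and 2, not on the profit level, which is why every line in the family is parallel to every other.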

The objective of a programming calculation is to pick the optimal (in 
our examples, the most profitable) among the feasible output combina- 
tions. In geometric terms, this is represented by the point in the feasible 
region which happens to lie beneath the highest point on the profit surface 
ORR'R", i.e., the feasible point which lies on the most valuable profit 
indifference curve. 

It follows from the result that a move to a higher indifference curve 
always increases profits, that the optimal point of a linear program will 
always lie on the boundary of the feasible region.8 The logic is simple. Any 
commodity whose production is profitable will continue to be lucrative as 
its output expands because there will be neither diminishing returns to 
scale nor unfavorable effects on input and output prices. It will, therefore, 
always pay to expand production until some capacity limit is reached, i.e., 


7 There is an exception to this result when the outputs in question bring in losses 
rather than profits. In such a case profits may decrease when we move to higher in- 
difference curves. However, this exception does not affect the rest of the argument. 

8 It will be shown in Chapter 7 that where a programming problem is nonlinear, 
most of the preceding theorems need not hold. The profit surface need not be a plane, 
the iso-profit curves need not be parallel straight lines, and an optimal point need not 
occur on the boundary of the feasible region. 
equation in (3’) a unit increase in X1 decreases Sb by 3 units [the element, 
−3, in the Sb row and X1 column of matrix (5)], and since there are 9,000 
units of Sb to begin with (the element in the Sb row and first column), Sb 
production will be eliminated altogether with a 9,000/3 = 3,000-unit 
output of X1. Hence Sb will be reduced to zero before we can reach the 
8,000-unit output of X1 needed to eliminate output Sa. It is therefore im- 
possible to expand the output of X1 beyond 3,000 units, and Sb will be the 
output which is reduced to zero in the next basic solution; that is, the 
element, −3, in the Sb row and X1 column must be the pivot in matrix (5) 
because the ratio 9,000/3, the output of X1 which reduces Sb to zero, is 
smaller than 8,000/1, the corresponding figure for output Sa. This is our 
Rule 9 for the choice of pivot element. 
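
The ratio comparison just described can be sketched as a small function. It uses only the figures quoted in the paragraph above (constants 8,000 and 9,000, and X1 column entries −1 and −3 in the Sa and Sb rows); the function name and the list layout are assumptions made for illustration.

```python
from fractions import Fraction


def ratio_test(constants, column):
    """Pick the pivot row: the row whose variable is driven to zero first.

    constants[i] is the current value of the i-th row variable and column[i]
    is the (negative) amount by which it changes per unit of the entering
    variable. Rows with non-negative entries impose no limit and are skipped.
    Returns (row index, limiting output), or None if no row limits expansion.
    """
    best = None
    for row, (const, coeff) in enumerate(zip(constants, column)):
        if coeff >= 0:
            continue
        limit = Fraction(const, -coeff)
        if best is None or limit < best[1]:
            best = (row, limit)
    return best


row_names = ["Sa", "Sb"]
constants = [8000, 9000]   # first-column elements of the Sa and Sb rows
x1_column = [-1, -3]       # X1-column elements of the Sa and Sb rows

pivot_row, max_output = ratio_test(constants, x1_column)
print(row_names[pivot_row], max_output)
```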


10. Illustration: Another Pivot Step 


This completes the discussion of the simplex-method computation. 
However, the reader may have found it confusing on first reading, and a 
careful description of the computation of the next matrix may help to 
pull the preceding material together into a clear pattern. We notice that 
in the previous matrix (7) there is only one column (the last) whose top 
element is positive. The pivot must therefore be chosen from this column 
(Rule 8). Both remaining elements in this column are negative, and to 
choose the pivot we must find the quotients obtained by dividing the 
corresponding first-column elements by these. The quotients are 

        5,000/(−4/3) = −3,750   and   3,000/(−2/3) = −4,500. 


Since the former is the smaller quotient in absolute value, −4/3, the element 
in the Sa row and X2 column, becomes the pivot element (Rule 9). That 
means the roles of variables X2 and Sa are to be interchanged in the next 
matrix, where Sa will be reduced to zero and X2 will become positive. The 
new matrix is then produced as follows: 


1. Interchange X2 and Sa in the column and row headings so that the 
third column is now labeled Sa and the second row is called X2. 

2. The pivot element −4/3 in (7) is replaced by its reciprocal −3/4, and 
the rest of the elements of the pivot row are replaced by the corresponding 
old elements in (7), each multiplied by −1 and divided by the pivot (−4/3), 


20 The reader who omitted the special computational techniques of Section 8 is 
reminded that the remainder of this section [except for matrix (12) and the last para- 
graph of the section] is just a review of these methods so this material should be skipped 
by him. 
as shown in matrix (12a) (Rules 4 and 5): 


                     1                  Sb                 Sa 
        Z  = 
(12a)   X2 =  5,000(−1)/(−4/3)   (1/3)(−1)/(−4/3)        −3/4 
        X1 = 

which gives 

                  1        Sb       Sa 
        Z  = 
        X2 =  3,750      1/4     −3/4 
        X1 = 

3. The remaining elements of the pivot column in (7) are replaced by 
the corresponding old elements each divided by the pivot (−4/3) as shown 
in matrix (12b) (Rule 6): 


                  1        Sb       Sa 
        Z  =                      (1/3)/(−4/3) = −1/4 
(12b)   X2 = 
        X1 =                      (−2/3)/(−4/3) = 1/2 


4. Finally, the remaining elements in matrix (7) are replaced by use 
of formula (10) of Rule 7 as shown in matrix (12c): 


                              1                                       Sb                          Sa 
        Z  =  7,500 − (5,000)(1/3)/(−4/3) = 8,750       −5/6 − (1/3)(1/3)/(−4/3) = −3/4 
(12c)   X2 = 
        X1 =  3,000 − (5,000)(−2/3)/(−4/3) = 500        −1/3 − (1/3)(−2/3)/(−4/3) = −1/2 
If we now combine the information in (12a), (12b), and (12c), we obtain 
our new matrix (12): 


                  1        Sb       Sa 

        Z  =  8,750     −3/4     −1/4 

(12)    X2 =  3,750      1/4     −3/4 
        X1 =    500     −1/2      1/2 


It will be noted that all (except the first) of the elements in the top row 
are negative. These negative numbers are the marginal profitabilities of 
the items Sb and Sa, whose outputs are now zero. Hence, by Rule 3, the 
corresponding basic solution is optimal (any nonzero value of Sb and Sa 
must decrease profit, Z). By (4) this solution is X2 = 3,750, X1 = 500, 
Sa = Sb = 0, and it yields Z = 8,750 in profit. 
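
The four replacement steps of this section can be collected into a single function. What follows is a sketch under stated assumptions, not the author's own procedure: the entries of matrix (7) are not printed in this passage and have been reconstructed here from the quotients 5,000/(−4/3) and 3,000/(−2/3) and from the results Z = 8,750, X2 = 3,750, X1 = 500 quoted above, and the update for the remaining elements is the usual rectangle rule, assumed to correspond to formula (10).

```python
from fractions import Fraction as F


def pivot(matrix, r, c):
    """Return the matrix after interchanging the row-r and column-c variables.

    Each row reads: row variable = constant + sum(coefficient * column variable).
    """
    p = matrix[r][c]
    new = []
    for i, row in enumerate(matrix):
        new_row = []
        for j, old in enumerate(row):
            if i == r and j == c:
                new_row.append(1 / p)                    # pivot -> its reciprocal
            elif i == r:
                new_row.append(-old / p)                 # pivot row: times -1, over pivot
            elif j == c:
                new_row.append(old / p)                  # pivot column: over pivot
            else:
                new_row.append(old - matrix[i][c] * matrix[r][j] / p)  # all other elements
        new.append(new_row)
    return new


# Matrix (7) as reconstructed (an assumption): columns 1, Sb, X2; rows Z, Sa, X1.
matrix_7 = [
    [F(7500), F(-5, 6), F(1, 3)],
    [F(5000), F(1, 3), F(-4, 3)],
    [F(3000), F(-1, 3), F(-2, 3)],
]

# Pivot on the Sa row (index 1) and the X2 column (index 2).
matrix_12 = pivot(matrix_7, r=1, c=2)   # columns 1, Sb, Sa; rows Z, X2, X1
for label, row in zip(["Z ", "X2", "X1"], matrix_12):
    print(label, [str(v) for v in row])
```

Both remaining elements in the top row of the result are negative, so no further pivot would raise Z.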


PROBLEMS 


1. Maximize E = 3x + 7y + 6z subject to 
        2x + 2y + 2z ≤ 8 
        x + y ≤ 3 
        x ≥ 0, y ≥ 0, z ≥ 0. 
2. Maximize R = 4x + 3y subject to 
        x + 35y ≤ 9 
        2x + y ≤ 8 
        x + y ≤ 6 
        x ≥ 0, y ≥ 0. 


3. Maximize R = 4x + 6y subject to 
irc yS4 
        2x + y ≤ 8 
        4x − 2y ≤ 2 
        x ≥ 0, y ≥ 0. 
4. Maximize R = 4x + y subject to 
        x + 2y ≤ 5 
        3x + 2y ≤ 4 
        x ≥ 0, y ≥ 0. 
11. Cases Where the Origin is not a Feasible Solution 


We started our discussion of the simplex computation by assuming that 
a feasible basic solution can be found by setting all variables except the 
slack variables equal to zero. In this way we obtained our first basic solu- 
tion (4) from which all subsequent trial solutions were derived. Unfortu- 
nately, many programming problems are inconsistent with such a solution. 
Some of the constraints may lack slack variables because they were equa- 
tions to begin with. Even if this difficulty does not arise, a basic solution in 
which all variables, except the slack elements, are given the value zero may 
not be feasible. 

For example, consider a program with an inequality of a sort that arises 
typically out of a minimum-requirements specification. A simplified illus- 
tration is an advertising budgeting problem that aims to minimize the cost 
of getting 160 million exposures among noncollege audiences (the number of 
times one of the company’s advertisements is seen or read by some such 
person—two persons seeing the same ad or one person seeing two different 
ads each counts as two “exposures”). Because of the nature of the product, 
the company wants at least 60 million exposures to involve persons with 
family incomes over $8,000 per year, and at least 80 million exposures to 
involve persons between 18 and 40 years of age. The issue is to decide 
how to divide the company’s budget between magazine and television 
advertising. Survey information indicates the size and composition of the 
audiences of the two media. The following table provides all the relevant 
information: 


                                                        Magazine   Television 
Cost per ad (thousands of dollars)                          40         200 
Noncollege audience per ad (millions)                        4          40 
Audience (per ad) with income over $8,000 (millions)         3          10 
Audience (per ad) ages 18-40 (millions)                      8          10 


This states, for example, that each magazine advertisement reaches 
4 million noncollege graduates, 3 million persons with an income over 
$8,000 per year, etc. Now let m and t represent the number of magazine and 
television advertisements under consideration by the firm, and let α be its 
total advertising cost. The cost minimization problem, then, is given by 
the following program: 
Minimize α = 40m + 200t 


subject to 
        4m + 40t ≥ 160   (noncollege requirement) 
(13)    3m + 10t ≥ 60    (income requirement) 
        8m + 10t ≥ 80    (age requirement) 
        m ≥ 0, t ≥ 0. 


Using slack variables the constraints may be rewritten as 


        4m + 40t − l1 = 160 
(13a)   3m + 10t − l2 = 60 
        8m + 10t − l3 = 80 

        m ≥ 0, t ≥ 0, l1 ≥ 0, l2 ≥ 0, l3 ≥ 0. 


Here, the nonnegative slack variables, l1, l2, and l3, must be subtracted from 
the left-hand sides of their respective constraints because (unlike the 
machine time-warehouse capacity constraint problem described earlier in 
the chapter) the activity levels now to be decided upon must produce a 
result (e.g., reach an audience) greater than or equal to some predetermined 
level (160 million), not less than or equal to some capacity figure, as before. 
Hence, if the advertising campaign were to produce an acceptable 164 
million noncollege exposures, we must subtract l1 = 4 million from this 
number to make it equal to the 160 million requirements figure. 

Suppose, now, that we try to proceed as we did previously and attempt 
to obtain an initial feasible basic solution by setting both of our structural 
variables, m and t, equal to zero. By (13) this yields the absurd result 
0 ≥ 160, 0 ≥ 60, 0 ≥ 80, or, alternatively, by (13a), −160 = l1 ≥ 0, 
−60 = l2 ≥ 0, and −80 = l3 ≥ 0, none of which, obviously, can be true. 
Hence the approach we used before to begin our simplex calculation, taking 
the origin as our initial basic solution, just does not work. The reason, in 
economic terms, is very simple. Where our constraints represent maximal 
capacities as in the output problem earlier in this chapter, a solution in 
which all outputs are zero is certainly feasible (if not very profitable). Zero 
outputs will obviously not overstrain the capacity of the firm’s facilities. 
But now, where the constraints represent minimum results that are re- 
quired for acceptability, zero activity levels will not do the trick. The firm 
cannot possibly obtain 160 million audience exposures with zero advertising 
of both varieties. It follows that the easy method we have used previously 
to find our initial basic solutions, that is, setting all our structural variables 
equal to zero, does not always work. 
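
The failure of the origin can be verified directly. The sketch below computes the three slack variables of the advertising program for any pair (m, t), using the coefficients of the table; the function names and the trial plans other than the origin are illustrative assumptions.

```python
# (name, audience per magazine ad, audience per television ad, required exposures), in millions
requirements = [
    ("noncollege", 4, 40, 160),
    ("income", 3, 10, 60),
    ("age", 8, 10, 80),
]


def slacks(m, t):
    """Amount by which each requirement is exceeded; a negative value means it is violated."""
    return {name: a * m + b * t - need for name, a, b, need in requirements}


def is_feasible(m, t):
    """True if m, t and all three slack variables are nonnegative."""
    return m >= 0 and t >= 0 and all(v >= 0 for v in slacks(m, t).values())


origin = slacks(0, 0)
print(origin, is_feasible(0, 0))

# A hypothetical plan yielding 164 million noncollege exposures, so that slack is 4.
print(slacks(1, 4)["noncollege"])

# A hypothetical plan that meets every requirement.
print(slacks(10, 3), is_feasible(10, 3))
```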

We will see in the next chapter that there is an easy alternative procedure 
that works in most problems that arise naturally out of economic issues. 
Duality theory will permit us to substitute for a problem such as (13) 
another problem, called the dual, for which the origin is a feasible basic 
solution. In the course of finding the solution to this dual problem we will 
see that we also obtain, automatically, the solution to the initial ("primal") 
problem (13).21 

For the moment we will have to content ourselves with a graphic solu- 
tion of problem (13) which should be illuminating. In Figure 4a we have 
represented the three constraints in (13). For example, the reader should 
verify that the line EE’ labeled “exposures” represents the equation 
4m + 40t = 160; i.e., it gives all combinations of m and t that just barely 
satisfy the first constraint, which requires a total of 160 million audience 
exposures. Similarly II’ and AA’ correspond, respectively, to the income 
and age requirement constraints. 


[Figure 4a: constraint lines and feasible region (horizontal axis m). Figure 4b: iso-cost lines.]


Now, since our constraints are all “greater-than-or-equal-to” require- 
ments, feasibility means that each and every one of the minimum require- 
ments represented by the constraint lines must at least be met. That is, any 
point, if it is to represent a feasible combination of m and t, must lie on or 
above each and every one of the constraint lines. The feasible region therefore 
is the shaded region representing all points that do not lie below any of the 
constraint lines. This is the shaded region on, or above and to the right of, 
the polyhedral boundary KABCE'F.


2 An appendix to the following chapter will also describe an alternative and somewhat
more time-consuming calculation that works generally, even in the unusual cases where
the dual method does not.

Our objective now is to minimize total costs, that is, to get onto the lowest of
the iso-cost lines that can be reached from the feasible region (Figure 4b).
To find these lines we utilize the total advertising cost function,
\(\alpha = 40m + 200t\). Any iso-cost line is obtained by setting \(\alpha\) equal to some
constant. For example, taking \(\alpha = 200\) we obtain the equation 200 = 40m + 200t, which
is represented by the straight-line segment labeled \(\alpha = 200\). All the other
iso-cost lines are parallel to this one, since a change in the value of \(\alpha\) does
not affect the coefficients of m and t in the cost equation and, hence, does
not change the slope of the iso-cost line.

The lowest of these iso-cost lines that can be reached from the feasible
region is obviously the one labeled \(\alpha_{\text{opt}}\), and it touches the feasible region
at corner point C. Thus, C represents the optimal solution and we read from
the graph that it corresponds, at least approximately, to the values m = 10,
t = 3.
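The graphic reasoning can be imitated numerically. The sketch below, in Python, is offered only as an illustration: it assumes that the minimum cost is attained at a corner of the feasible region (which holds here because both cost coefficients are positive and the variables are nonnegative), lists the intersections of the boundary lines, keeps the feasible ones, and picks the cheapest. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Boundary lines a*m + b*t = c. The last two rows are the axes m = 0 and t = 0,
# so "a*m + b*t >= c" also expresses the nonnegativity conditions.
LINES = [(4, 40, 160), (3, 10, 60), (8, 10, 80), (1, 0, 0), (0, 1, 0)]


def intersection(line1, line2):
    """Intersection of two lines by Cramer's rule, or None if they are parallel."""
    (a1, b1, c1), (a2, b2, c2) = line1, line2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))


def feasible(point):
    m, t = point
    return all(a * m + b * t >= c for a, b, c in LINES)


def cost(point):
    m, t = point
    return 40 * m + 200 * t


# Collect every feasible intersection of two boundary lines.
corners = set()
for line1, line2 in combinations(LINES, 2):
    point = intersection(line1, line2)
    if point is not None and feasible(point):
        corners.add(point)

best = min(corners, key=cost)
print(best == (10, 3), cost(best))  # True 1000
```

Under these assumptions the feasible corners are (0, 8), (4, 4.8), (10, 3), and (40, 0), and the cheapest of them is the point m = 10, t = 3 read from the graph.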

That completes our graphic solution. However, several additional 
observations may be illuminating. Once we have found the approximate 
solution to a linear program graphically, it is generally easy to go on to 
determine the solution values precisely. As usual in linear programming, a 
solution will occur at some corner of the diagram. In the two structural 
variable cases that can be handled diagrammatically, two constraint lines 
will always meet at such a corner. Thus, once we know which corner and, 
hence, which constraint lines go through the solution point, we can find the 
coordinates of the solution point by solving the two corresponding constraint 
equations simultaneously. In the case in Figure 4 the optimal point C is 
the intersection of the exposures and the income constraint lines. Conse- 
quently, we can find the values of m and t by solving simultaneously the two 
corresponding constraint equations 


\[
\begin{aligned}
4m + 40t &= 160\\
3m + 10t &= 60.
\end{aligned}
\]


The reader can verify directly from these equations that the optimal solu- 
tion is in fact given exactly by m = 10, t = 3, the values we had read off from 
Figure 4. 
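For the reader who wishes to see the elimination written out, one way of carrying it through is sketched here. Taking the two equations 4m + 40t = 160 and 3m + 10t = 60, we multiply the second by 4 and subtract the first, which removes t:

```latex
\[
\begin{aligned}
4(3m + 10t) - (4m + 40t) &= 4(60) - 160\\
8m &= 80\\
m &= 10,\\
10t &= 60 - 3(10) = 30\\
t &= 3.
\end{aligned}
\]
```

Substituting these values into the cost function gives a total cost of 40(10) + 200(3) = 1,000.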

There is one other observation that we can usefully draw from the 
graph. We note in Figure 4 that the origin lies outside the feasible region. 
This is in clear contrast with the cases represented in Figures 1, 2, and 3, 
where the origin is a corner of the shaded feasible region. That gives us 
another way of seeing why, in our earlier calculations, we could start off by 
taking the origin as our initial basic feasible solution, but why we can no
longer do so in the problem (13) which we are now discussing.


Chapter 6. Duality


1. The Dual Problem 


With every linear programming maximization problem it has proved
useful to associate a closely related minimization problem and vice versa.
Such pairs of problems are called dual linear programming problems.¹ The
analysis of dual programming problems has attracted a great deal of
attention among both economists and mathematicians, for a number of reasons:


1. Duality yields a number of powerful theorems which add
substantially to our understanding of linear programming.

2. Duality analysis has been very helpful in the solution of
programming problems. Indeed, as we shall see, it is frequently easier
to find the solution of a programming problem by first solving its
associated dual problem.

3. The dual problem turns out to have an extremely illuminating
economic interpretation which, incidentally, shows that old-fashioned
marginal analysis is always implicitly involved in the search for an
optimal solution of a linear programming problem.

Before turning to the economic interpretation of the dual, it will be
useful to describe the meaning of duality in a purely formal manner,
without any reference to meaning or interpretation. At the end of this
section, given any linear programming problem, the reader should be able
to write out its dual.


1 The basic ideas of duality theory were first suggested by John von Neumann. The
theory was first vigorously developed by David Gale, Harold Kuhn, [...] 195[...]

Consider the general linear programming problem which we will call 
the primal problem: 


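\[
\begin{aligned}
\text{maximize}\quad & \Pi = P_1Q_1 + P_2Q_2 + \cdots + P_nQ_n\\
\text{subject to}\quad & a_{11}Q_1 + a_{12}Q_2 + \cdots + a_{1n}Q_n \le C_1\\
& \qquad\vdots\\
& a_{m1}Q_1 + a_{m2}Q_2 + \cdots + a_{mn}Q_n \le C_m\\
& Q_1 \ge 0,\ \ldots,\ Q_n \ge 0.
\end{aligned}
\tag{1}
\]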


Suppose a mischievous gremlin were let loose on this linear programming
problem and decided that he would turn everything he possibly could on
its head. There are a number of obvious things he would think of. For the
word "maximize" he would substitute "minimize," and for the symbol \(\le\) he
would substitute \(\ge\). Furthermore, a nice way to add to the resulting
confusion might be to put the capacity figures \(C_1, C_2, \ldots, C_m\) where the unit
profit figures, \(P_1, P_2, \ldots, P_n\), used to be and vice versa. For good measure
he might reverse the order in which the constants appear in the inequalities,
so that what formerly read across would now read down. That is, where
\(a_{12}\) was formerly the second constant in the first inequality, he would now
make it the first constant in the second inequality. In other words, the
constants found by reading across in the first inequality of the primal
would now be found by reading down from the first inequality to the
second to the third, etc.

Finally, to cap the confusion, our gremlin would probably decide to get
rid of our original variables \(Q_1, Q_2, \ldots, Q_n\) altogether and substitute for
them an entirely new set of variables \(V_1, V_2, \ldots, V_m\). Having done all
this, he would be left with the following problem:

\[
\begin{aligned}
\text{minimize}\quad & \alpha = C_1V_1 + C_2V_2 + \cdots + C_mV_m\\
\text{subject to}\quad & a_{11}V_1 + a_{21}V_2 + \cdots + a_{m1}V_m \ge P_1\\
& \qquad\vdots\\
& a_{1n}V_1 + a_{2n}V_2 + \cdots + a_{mn}V_m \ge P_n\\
& V_1 \ge 0,\ V_2 \ge 0,\ \ldots,\ V_m \ge 0.
\end{aligned}
\]

The new program produced by the gremlin is precisely what we call the
dual. Before going further, let us recapitulate the characteristics of the
two problems which give them their remarkable symmetry:


1. If the primal problem involves maximization, the dual in- 
volves minimization, and vice versa. 

2. If the primal involves \(\ge\) signs, the dual involves \(\le\) signs, and
vice versa.

3. The profit constants \(P_j\) of the primal problem take the place of the
capacity constants \(C_i\), and vice versa.

4. In the constraint inequalities the coefficients which were
found by going from left to right are positioned in the dual from top
to bottom, and vice versa.

5. A new set of variables appears in the dual. 

6. Neglecting the number of nonnegativity conditions, if there 
are n variables and m inequalities in the primal problem, in the dual 
there will be m variables and n inequalities. 


Finally, it should be noted that if we were to let our gremlin loose 
again and have him do his work on the dual problem, he would end right 
back with the problem with which he started. That is, if he were to take 
the dual problem and subject it to all the abuses which he had heaped on 
our original linear program, he would find that in the end he would have 
undone all his mischief. The dual of the dual problem is the original linear 
programming problem itself. It follows that, given such a pair of problems, 
it is entirely arbitrary which of them is referred to as the primal and which 
as the dual. Each one of them is the dual of the other. 

These rules for the construction of the dual are illustrated in the follow- 
ing pair of numerical linear programming problems. The reader should 
verify that these problems do indeed constitute one another’s dual. In 
order to emphasize that minimization problems also have a dual we have 
selected as our illustrative primal program the minimization problem (13) 
of Chapter 5. 


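\[
\begin{array}{ll}
\textbf{Primal Problem} & \textbf{Dual Problem}\\
\text{minimize} & \text{maximize}\\
\quad \alpha = 40m + 200t & \quad R = 160V_1 + 60V_2 + 80V_3\\
\text{subject to} & \text{subject to}\\
\quad 4m + 40t \ge 160 & \quad 4V_1 + 3V_2 + 8V_3 \le 40\\
\quad 3m + 10t \ge 60 & \quad 40V_1 + 10V_2 + 10V_3 \le 200\\
\quad 8m + 10t \ge 80 & \\
\quad m \ge 0,\ t \ge 0 & \quad V_1 \ge 0,\ V_2 \ge 0,\ V_3 \ge 0.
\end{array}
\tag{2}
\]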


It will be observed, in our illustration, that the primal problem involves
two variables and three structural constraints, whereas the dual problem
involves three variables and two constraints. It should also be pointed out
that one feature is not tampered with in going from a linear programming
problem to its dual: the inequalities of the nonnegativity conditions retain
their directions. That is, both in the primal and in the dual problem each
variable is required to be greater than or equal to zero.
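The gremlin's rules are mechanical enough to be written as a short routine. The following Python sketch is one possible rendering, not a standard library facility: the dictionary layout and the name `dual` are assumptions made for this illustration, and the routine handles only the symmetric form used in this section (a maximization with ≤ constraints or a minimization with ≥ constraints, all variables nonnegative).

```python
def dual(problem):
    """Return the dual of a program written in the symmetric form of the text.

    `problem` is a dict with keys:
      sense      "max" (constraints read rows <= rhs) or "min" (rows >= rhs)
      objective  coefficients of the objective function
      rows       constraint coefficients, one list per inequality
      rhs        right-hand constants of the inequalities
    """
    return {
        # Rule 1: maximization and minimization change places
        # (rule 2, the direction of the inequalities, goes with the sense).
        "sense": "min" if problem["sense"] == "max" else "max",
        # Rule 3: objective constants and right-hand constants change places.
        "objective": list(problem["rhs"]),
        "rhs": list(problem["objective"]),
        # Rule 4: what read across now reads down (the rows are transposed).
        "rows": [list(column) for column in zip(*problem["rows"])],
    }


# The advertising problem (13) of Chapter 5, i.e. the primal of pair (2).
advertising = {
    "sense": "min",
    "objective": [40, 200],
    "rows": [[4, 40], [3, 10], [8, 10]],
    "rhs": [160, 60, 80],
}

print(dual(advertising)["rows"])                # [[4, 3, 8], [40, 10, 10]]
print(dual(dual(advertising)) == advertising)   # True
```

The first line of output reproduces the coefficients of the dual constraints in (2), and the second confirms that applying the rules twice returns the original program.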

Before proceeding further it is convenient to rewrite our general primal 
and dual programs, this time including the slack variables so that the 
structural constraints become equations. We obtain 


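\[
\begin{array}{ll}
\textbf{Primal Problem} & \textbf{Dual Problem}\\
\text{maximize} & \text{minimize}\\
\quad \Pi = P_1Q_1 + \cdots + P_nQ_n & \quad \alpha = C_1V_1 + \cdots + C_mV_m\\
\text{subject to} & \text{subject to}\\
\quad a_{11}Q_1 + \cdots + a_{1n}Q_n + U_1 = C_1 & \quad a_{11}V_1 + \cdots + a_{m1}V_m - L_1 = P_1\\
\quad \qquad\vdots & \quad \qquad\vdots\\
\quad a_{m1}Q_1 + \cdots + a_{mn}Q_n + U_m = C_m & \quad a_{1n}V_1 + \cdots + a_{mn}V_m - L_n = P_n\\
\quad Q_1 \ge 0, \ldots, Q_n \ge 0, & \quad V_1 \ge 0, \ldots, V_m \ge 0,\\
\quad U_1 \ge 0, \ldots, U_m \ge 0 & \quad L_1 \ge 0, \ldots, L_n \ge 0.
\end{array}
\tag{3}
\]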


Here the \(U_i\) and \(L_j\) represent, respectively, the primal and dual slack
variables. It should be noted that in our dual program, since the constraints
are “greater than or equal to” inequalities, the slack variables \(L_j\) are
subtracted from the left-hand side of the equations. To give a simple
example, in translating the inequality \(3Q_1 \ge 7\) into an equation we must
rewrite it as \(3Q_1 - L = 7\), where, as usual, \(L \ge 0\).
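As a further illustration of the same sign convention, the numerical pair (2) may be put into slack variable form as follows. The symbols \(l_1, l_2, l_3\) follow the notation of (13a); the names \(u_1\) and \(u_2\) for the slack variables of the maximization problem are introduced here only for the purpose of this sketch.

```latex
\[
\begin{aligned}
4m + 40t - l_1 &= 160, & 4V_1 + 3V_2 + 8V_3 + u_1 &= 40,\\
3m + 10t - l_2 &= 60,  & 40V_1 + 10V_2 + 10V_3 + u_2 &= 200,\\
8m + 10t - l_3 &= 80,  & &
\end{aligned}
\]
```

All of the variables, ordinary and slack alike, are required to be nonnegative. The slack variables are subtracted in the "greater than or equal to" constraints and added in the "less than or equal to" constraints.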


PROBLEMS

1. Write out the dual of the following problem:

\[
\begin{aligned}
\text{maximize}\quad & \Pi = 6Q_1 + 2Q_2\\
\text{subject to}\quad & 4Q_1 + Q_2 \le 5\\
& 3Q_1 + 2Q_2 \le 7\\
& Q_1 + Q_2 \le 3\\
& Q_1 \ge 0,\ Q_2 \ge 0.
\end{aligned}
\]

2. Show that the dual of your answer is the original programming problem.
3. Write out the dual to the program in Problem 1 in slack variable form.


2. Economic Interpretation of the Dual Problem


It is convenient to begin our interpretation of the dual problem in a 
manner which seems extremely artificial. But the reader will see the 
artificiality of the interpretation disappear rapidly as the analysis proceeds, 
and the very real significance of the duality theorems will make this 
abundantly clear. 

Let our primal program be a standard production problem, that is, 
the problem of determining the profit-maximizing output levels for the 
firm’s various products, subject to a number of scarce input (capacity) 
constraint limitations. The costs of the company’s fixed inputs \(C_1, C_2, \ldots, C_m\)
may not enter directly into its accounting profit calculations, particularly
if the warehouses, factories, and other facilities which these symbols
represent have been completely amortized. Nevertheless, it must be recog- 
nized that without these inputs the firm could not have earned its profits. 
Suppose, then, that the businessman in question decides to determine 
what portion of the profit on each of his products he "owes" to each of the 
inputs. In the economist’s jargon, he will undertake to impute all of the 
company’s profits to its scarce resources. For this purpose, he will seek to 
calculate an artificial accounting price or value V for each of his inputs 
and to choose a magnitude for each V such that the sum of these computed 
values of the scarce inputs going into any of his products, say shoes, is 
high enough to account for his profits on shoe production. That is, values 
of the V’s should be chosen so that, if possible, the net profit from shoe 
production (and the net profit from each other company output) would 
be zero if each scarce input \(C_i\) were to cost \(V_i\) dollars per unit.²

It must be emphasized that the zero profit condition in this problem is 
not directly related to the zero profit requirement for long-run equilibrium 
under perfect competition. In our imputation analysis zero profit is an ac- 
counting requirement. If accounting values V are proposed which do not 
impute profits completely to the scarce inputs, these accounting values 
must be raised until the unimputed residue has been eliminated. 

2 We will see that some commodities which the firm considers producing may end up
yielding negative accounting profits. That is, their gross profits are less than the value
imputed to the scarce resources needed to produce them. As the reader may surmise,
these are the items which will be excluded from the firm’s optimal product line.

We shall now show that the variables \(V_1\), \(V_2\), and \(V_3\) in the dual program
of the numerical problem in the exercises of the preceding section can
be interpreted as the required accounting values of the firm’s three scarce
resources. Let us assume that our three scarce resources are, respectively,
warehouse space, machine time, and inspection time, so that our original
linear programming problem tells us that the firm has available to
it 5 (million) cubic feet of warehouse space, 7 (hundred) hours of machine
time, and 3 (hundred) hours of inspection time. \(V_1\) may then be described
as the accounting value to the firm of a unit of warehouse space, \(V_2\) as the
accounting value of a unit of machine time, and \(V_3\) as the accounting value
of inspection time. Suppose we tentatively accept this interpretation.

Let us now look at the first inequality in the dual problem and show
that it is no more than an explicit statement of our no-accounting-profit
requirement for the first item in the firm’s product line, \(Q_1\). It will be
remembered that the coefficient 4 of the variable \(V_1\) in that inequality
had a very definite meaning in our primal problem. There it represented
the number of units of warehouse space used to produce one unit of output
1. Similarly, the second coefficient 3 in our dual inequality, that is, the
coefficient of \(V_2\), in our primal problem represented the amount of machine
time needed to produce a unit of output 1. Finally, the coefficient of
\(V_3\) in our dual inequality represented the number of units of inspection
time needed to produce a unit of output 1. In sum, the three coefficients,
the 4, the 3, and the unity, represent the quantities of the three different
inputs which go into a unit of commodity 1. Now, if each unit of warehouse
space is worth \(V_1\) dollars, then four units of warehouse space would be
worth four times \(V_1\) dollars and, similarly, if each unit of machine time is
worth \(V_2\) dollars, the three machine-time units would be worth \(3V_2\) dollars,
and, finally, the one unit of inspection time would be worth one times \(V_3\)
dollars if each unit is worth \(V_3\). We see, then, that the expression on the
left-hand side of our inequality, \(4V_1 + 3V_2 + 1V_3\), has a straightforward
economic interpretation, given the meaning which has tentatively been
assigned to our variables. That sum is the total value of the inputs which
is necessary to produce one unit of output of commodity 1.

Only one more step is needed to complete our interpretation of this 
inequality. We must now recall what is signified by the number 6 on the 
right-hand side of that relationship. Going back to our primal problem 
once more, we see that the 6 is the unit profit one obtains by producing our 
first commodity. Each unit of commodity 1 which is manufactured yields 
$6 to the firm. Our first inequality can now be read to state the following: 
The value of the inputs going into the production of a unit of commodity 1 
must be greater than or equal to the profit which the firm makes by pro- 
ducing a unit of commodity 1. Our first inequality, then, states that we 
must assign to each of the inputs a value sufficiently great to impute to 
them all the profits of output number 1. Just as the first inequality of the 
dual program serves to impute the profits of commodity 1, so the second 
inequality imputes all the profit of each unit of output 2 to the company’s 
scarce inputs. It states that the values of the three inputs which are used
to make a unit of commodity 2 must account fully for the $2 of profit which 
are yielded by a unit of commodity 2. 
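The reading of the first dual inequality can be tried out on numbers. The Python sketch below uses only the figures quoted above for commodity 1 (input requirements of 4, 3, and 1 units and a unit profit of $6); the two sets of accounting values are hypothetical trial figures chosen for illustration, not solutions taken from the text.

```python
# Data for commodity 1, read from Problem 1: units of each scarce input
# absorbed by one unit of output, and the unit profit in dollars.
INPUT_USE = {"warehouse": 4, "machine": 3, "inspection": 1}
UNIT_PROFIT = 6


def imputed_value(values):
    """Accounting value of the inputs going into one unit of commodity 1."""
    return sum(INPUT_USE[name] * values[name] for name in INPUT_USE)


def fully_imputed(values):
    """True when the first dual inequality, 4V1 + 3V2 + V3 >= 6, holds."""
    return imputed_value(values) >= UNIT_PROFIT


# Two hypothetical sets of accounting values, chosen only for illustration.
too_low = {"warehouse": 1, "machine": 0.5, "inspection": 0}
high_enough = {"warehouse": 1, "machine": 0.5, "inspection": 1}

print(imputed_value(too_low), fully_imputed(too_low))          # 5.5 False
print(imputed_value(high_enough), fully_imputed(high_enough))  # 6.5 True
```

The first set of values leaves fifty cents of the $6 profit unimputed and so violates the inequality; the second more than accounts for it.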

We will see presently why it is convenient to write these constraints
as inequalities rather than equations, that is, why we do not directly
require that values of the inputs be exactly equal to the profits. However,
one simple reason can be indicated now. The need for inequalities in the 
constraints follows directly from the fact that the number of variables 
and the number of structural constraints in a programming problem need 
not be the same. In our dual problem we have three variables and two in- 
equalities. We could just as easily have had, say, six variables and fifteen 
inequalities. If we had attempted to write those fifteen constraints as equa- 
tions rather than inequalities, we would have had a system involving 
fifteen equations in six unknowns, and obviously this is likely to run us into 
difficulties, for usually it is impossible to satisfy a system of equations con- 
taining more equations than unknowns. Since, therefore, in such a situa- 
tion we may be forced to relinquish equality in part of the system, we have 
chosen between the two apparently less desirable alternatives, overimputa- 
tion and underimputation, and decided to favor the former. That is, we 
have said, “If it is absolutely necessary to assign a value to the inputs
employed which is either greater than or less than profit, let us assign a 
total value which is greater than the profit.” 

But once we have stated the inequalities in this way, it may appear 
that there is no problem at all in solving our linear program. We need just 
assign values as capriciously high as we wish to each of the inputs and we 
can be sure that they will more than account for all of the profits. What 
prevents this sort of arbitrary solution is that our dual problem requires
us to minimize \(\alpha = 5V_1 + 7V_2 + 3V_3\). The value of the dual objective
function, \(\alpha\), also has an economic interpretation which follows directly
from our preceding discussion. It will be recalled that the company has
five units of warehouse space in its possession so that the total value of the
warehouse space available to the firm will be \(5V_1\) and, similarly, the total
value of the machine time available to the firm will be \(7V_2\) because there
are seven units of machine time at the firm's command, and so on. Hence
\(\alpha = 5V_1 + 7V_2 + 3V_3\) represents the total value of all of the inputs which
the firm has under its control. To summarize, then, the dual problem re- 
quires us to find the very smallest valuation of the company’s stock of inputs 
which completely accounts for all of the profits of each of the outputs. 

We may conclude this preliminary discussion of the interpretation of
the dual problem by ascribing an economic meaning to the dual slack
variables \(L_1, \ldots, L_n\). From our dual problem in (3) we see that, for
example, \(L_1\) is given by

\[
L_1 = (a_{11}V_1 + a_{21}V_2 + \cdots + a_{m1}V_m) - P_1.
\]

But \(P_1\) is the profit per unit of output 1, while, as we have just seen, the
expression in parentheses is the accounting value of the resources used in
producing a unit of output 1 (since \(a_{11}\) is the amount of input 1 used in a
unit of output 1 so that \(a_{11}V_1\) is the value of the amount of input 1 used in
the production of output 1, etc.). We can, therefore, rewrite the last
equation as

\[
L_1 = (\text{value of resources going into a unit of output 1}) - (\text{the unit profit of output 1}).
\]

Thus, we may consider \(L_1\) to represent a sort of relative loss figure for product
1, for if \(L_1\) is positive it tells us that the resources used in producing output
1 are worth more than the profits yielded by that commodity. We shall
presently return to these dual slack figures and see precisely what is meant 
by the “loss” which they represent. 

Let us now review the interpretation we have given the variables in 
our pair of programming problems. Our primal and dual problems involve 
the following four types of variable: 


\(Q_j\) the (quantity of) output of product j (the primal ordinary variables)
\(U_i\) the unused capacity of input i (the primal slack variables)
\(V_i\) the value (accounting price) of input i (the dual ordinary variables)
\(L_j\) the accounting loss per unit of output j (the dual slack variables).


Primal variable \(Q_j\) and dual variable \(L_j\), then, both refer to outputs,
while dual variable \(V_i\) and primal variable \(U_i\) both refer to inputs. This
information is summarized schematically in the following table:


                     Primal Variable          Dual Variable
                     (physical quantities)    (in money units)
Refers to outputs    \(Q_j\)                  \(L_j\)
Refers to inputs     \(U_i\)                  \(V_i\)


The interpretation of these four types of variable must be understood
clearly before the reader can hope to master the economic implications of
the dual. It should also be noted that the primal variables \(Q_j\) and \(U_i\) refer,
respectively, to physical output and input quantities and so must be
measured in units such as tons, square feet, kilowatt hours, etc. On the
other hand, both dual variables \(V_i\) and \(L_j\) refer to pecuniary magnitudes
and can be measured in a monetary unit such as the dollar.

So far the reader may well have the feeling that our interpretation of
the dual program is rather strained. We have forced an odd sort of
economic meaning onto our dual but there appears to be little reason for an
economist or anyone else to be interested. However, the mathematicians
have proved a number of theorems about the dual problem which
dramatically breathe life and power into the entire construct.


3. Some Duality Theorems


In this section several of the most pertinent theorems of duality theory 
will be described rather briefly. They and their derivations are discussed 
in somewhat greater detail in Appendix A to this chapter. 

As a preliminary matter, we may mention the following rather useful
theorem. Suppose we have found any feasible solution to the primal problem
and this solution yields as the value of the objective function (the total
profit figure) the number \(\Pi^*\). Suppose, moreover, that we have found some
feasible solution to the dual problem which yields the number \(\alpha^*\) for the
value of the dual objective function. Then \(\Pi^*\) will never exceed \(\alpha^*\), i.e.,
we will have \(\Pi^* \le \alpha^*\). This theorem tells us, in effect, that the company’s
total profit \(\Pi\) will never exceed \(\alpha\), the accounting value assigned to the
company's scarce inputs. Actually, that result is hardly surprising since,
as we saw in the last section, in constructing the dual problem we had
done it in a way which may well overimpute company profits but will
certainly leave no part of company profits unimputed to the scarce inputs.
Somewhat less obvious is

Duality Theorem I: The maximum value of \(\Pi\) will be exactly equal to the
minimum value of \(\alpha\), and any pair of feasible solutions for which \(\alpha = \Pi\)
must be optimal.

This theorem states that despite our apparent willingness to over- 
impute profits if necessary, everything comes out well in the end! When 
optimal solutions have been found for the primal and dual problems we 
will have assigned a total value \(\alpha\) to the company's scarce resources which
is exactly equal to total profit \(\Pi\)! As an example, it will be recalled that in
the illustrative problem of Sections 6-10 in Chapter 5, maximum profit
was equal to 8,750, as was shown in matrix (12). Theorem I then tells us
that the minimum value of the objective function of the dual problem will
also be \(\alpha = 8{,}750\). Furthermore, it tells us that if we find any primal and
dual solutions for which \(\alpha = \Pi = 8{,}750\), we need look no further. We will
then have found our optimal solutions. 
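Both results can be tried out on the numerical pair (2). The sketch below is only an illustration in Python. In that pair the minimization problem happens to be the primal, so the preliminary theorem reads "no feasible value of R exceeds any feasible cost." The trial points are hypothetical choices, and the dual values (2.5, 10, 0) were computed for this illustration rather than quoted from the text.

```python
def primal_feasible(m, t):
    return (m >= 0 and t >= 0 and 4 * m + 40 * t >= 160
            and 3 * m + 10 * t >= 60 and 8 * m + 10 * t >= 80)


def dual_feasible(v1, v2, v3):
    return (min(v1, v2, v3) >= 0 and 4 * v1 + 3 * v2 + 8 * v3 <= 40
            and 40 * v1 + 10 * v2 + 10 * v3 <= 200)


def cost(m, t):                      # primal objective of (2)
    return 40 * m + 200 * t


def dual_value(v1, v2, v3):          # dual objective R of (2)
    return 160 * v1 + 60 * v2 + 80 * v3


# Hypothetical trial points; the last entry of each list is the optimal pair.
primal_points = [(40, 0), (0, 8), (5, 5), (20, 5), (10, 3)]
dual_points = [(0, 0, 0), (5, 0, 0), (0, 0, 5), (1, 2, 3), (2.5, 10, 0)]

assert all(primal_feasible(*p) for p in primal_points)
assert all(dual_feasible(*v) for v in dual_points)

# Preliminary theorem: no feasible dual value exceeds any feasible primal cost.
print(all(dual_value(*v) <= cost(*p)
          for v in dual_points for p in primal_points))   # True

# Theorem I: at the optimal pair the two objective values coincide.
print(cost(10, 3), dual_value(2.5, 10, 0))                # 1000 1000.0
```

Since the feasible pair (10, 3) and (2.5, 10, 0) gives the same value, 1,000, for both objective functions, Theorem I tells us that both solutions are optimal.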

Another basic duality result is described by the next proposition: 

Duality Theorem II: In a pair of optimal solutions, the firm will produce 
only commodities whose accounting loss figures L are zero. In addition, 
in such optimal solutions only inputs which are used to capacity will re- 
ceive a nonzero accounting valuation V. Moreover, any pair of feasible 
solutions to the primal and dual problems which satisfy this requirement 
must be optimal! 

Symbolically, this theorem may be written:

in an optimal solution \(Q_jL_j = 0\) for each commodity j,
and \(U_iV_i = 0\) for each input i.


This set of equations is referred to as the complementary slackness conditions.
To see the connection between the symbolic and the verbal statements of
Theorem II, note that, e.g., the equation \(Q_1L_1 = 0\) requires either that
\(Q_1 = 0\) or that \(L_1 = 0\) (or both). This means that if \(Q_1 > 0\) (if output of
commodity 1 is positive), then we must have \(L_1 = 0\) (its accounting loss
must be zero). Similarly, if \(L_1 > 0\) (commodity 1 involves an accounting
loss), then we must have \(Q_1 = 0\). The interpretation of the other equations
in Theorem II is similar and is left to the reader.

Theorem II, then, is reassuring. It tells us that the dual values make 
some economic sense—inputs which are in excess supply to the firm are 
given a zero accounting value V and commodities which are associated 
with a nonzero accounting loss are those which, optimally, should not be 
produced. 
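The complementary slackness conditions lend themselves to a mechanical check. The Python sketch below applies them to the numerical pair (2), where the roles are reversed because the primal is a minimization: each activity level is multiplied by the slack of its dual constraint, and each dual value by the surplus of its primal requirement. The function names are hypothetical, and the dual values (2.5, 10, 0) were computed for this illustration rather than taken from the text.

```python
# Pair (2): rows are (coefficient of m, coefficient of t, requirement).
REQUIREMENTS = [(4, 40, 160), (3, 10, 60), (8, 10, 80)]
UNIT_COSTS = (40, 200)


def surpluses(m, t):
    """Amount by which each requirement is exceeded."""
    return [a * m + b * t - need for a, b, need in REQUIREMENTS]


def dual_slacks(v):
    """Unit cost of each medium minus the value imputed to what it delivers."""
    return [UNIT_COSTS[k] - sum(v[i] * REQUIREMENTS[i][k] for i in range(3))
            for k in range(2)]


def complementary(m, t, v):
    """True when every variable-slack product of Theorem II is zero."""
    activity_ok = all(x * s == 0 for x, s in zip((m, t), dual_slacks(v)))
    value_ok = all(vi * s == 0 for vi, s in zip(v, surpluses(m, t)))
    return activity_ok and value_ok


v = (2.5, 10, 0)                  # dual values computed for this illustration
print(surpluses(10, 3))           # [0, 0, 30]
print(dual_slacks(v))             # [0.0, 0.0]
print(complementary(10, 3, v))    # True
print(complementary(40, 0, v))    # False
```

At m = 10, t = 3 only the third requirement is oversatisfied, and it is exactly the one whose accounting value is zero. The feasible but costlier point m = 40, t = 0 fails the test because it oversatisfies the second requirement, which carries a positive value.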

Looked at the other way, the theorem tells us that from the optimal 
solution of the dual problem we can immediately make a number of pre- 
dictions about the solution to the primal; we can be sure, in advance, which 
outputs will not be produced and which inputs will be used to capacity. 
For example, as will be shown later in this chapter, the solution to the dual
of the illustrative linear programming problem (3’) of Sections 6-10 in
Chapter 5 is \(L_1 = 0\), \(L_2 = 0\), \(V_1 = 1/4\), \(V_2 = 3/4\). With the aid of Theorem
II these figures tell us to expect both outputs 1 and 2 to be produced and
that both inputs will be used to capacity. Indeed, this turns out to be the
case, for it will be recalled that the solution to the primal problem is given
by matrix (12) of Chapter 5 and is \(Q_1 = 500\), \(Q_2 = 3{,}750\), \(U_1 = 0\), \(U_2 = 0\).

We may also observe, specifically, that all the equations of Theorem II 
are satisfied by the pair of solutions just given, for we have 


\[
Q_1L_1 = (500)(0) = 0, \quad Q_2L_2 = (3{,}750)(0) = 0, \quad U_1V_1 = (0)(1/4) = 0
\]
and
\[
U_2V_2 = (0)(3/4) = 0.
\]


Perhaps the more surprising part of the theorem is its converse portion:
the assertion that any pair of feasible solutions for which all \(Q_jL_j = 0\) and
all \(U_iV_i = 0\) must be optimal. It is rather plausible that an optimal
evaluation of the L’s and V's should have us produce no item which involves a
"loss" L and should impute no value to an input which goes partially un- 
used, but intuitively it is not easy to see why no more than this is required 
to guarantee optimality in a pair of solutions. Nevertheless, this converse 
proposition is valid and its proof, which is not very difficult (once Theorem 
I has been derived) is given in Appendix A of this chapter. 

Further duality theorems tell us more about the meaning of the dual 
variables \(V_i\) and \(L_j\) and should convince the reader of their real economic
significance. 


First, let us examine the input valuation variables \(V_i\). As an example,
suppose input i is machine time. Then, as we have said, \(V_i\) is simply the
dollar value which is assigned to an hour of time of this piece of equipment.
Now it is natural for the economist to feel that the value which should be
assigned to time on such an item is its marginal revenue product (its
marginal contribution to the profits of the firm). And this is usually what
\(V_i\) turns out to be. As is indicated in Appendix A, normally \(V_i\) is equal to
the marginal profit contribution of input i.³ That is, \(V_i\) tells us (approximately)
what would be added to the firm's profits if somehow the company could
increase its available machine time by one hour (per day). More precisely,
(wherever the profit function \(\Pi\) has a derivative) we have

\[
V_i = \partial\Pi/\partial C_i,
\]

where \(C_i\) is the total capacity of input i (the total amount of machine time
available to the firm). In other words, if we were to make two independent
programming calculations, one calculation with the aid of the primal
problem alone to determine the value of \(\partial\Pi/\partial C_i\) and another calculation
with the aid of the dual to find the optimal value of \(V_i\), comparison of the
two figures after both computations were completed would show them to
be precisely equal!
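The "two independent calculations" can be imitated on the advertising pair (2). There the primal is a minimization, so the analogous statement is that the dual value of a requirement equals the marginal cost of tightening it. The Python sketch below is a minimal illustration under that reading: `min_cost` is a hypothetical helper that re-solves the primal by inspecting the corners of the feasible region, and the comparison figure 5/2 is the dual value of the exposure requirement as computed for this illustration, not a number quoted from the text.

```python
from fractions import Fraction
from itertools import combinations


def min_cost(exposures):
    """Minimum of 40m + 200t when the first requirement is 4m + 40t >= exposures."""
    # Boundary lines a*m + b*t = c; the last two rows are the axes.
    lines = [(4, 40, exposures), (3, 10, 60), (8, 10, 80), (1, 0, 0), (0, 1, 0)]
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue                      # parallel lines have no corner
        m = Fraction(c1 * b2 - c2 * b1, det)
        t = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * m + b * t >= c for a, b, c in lines):
            total = 40 * m + 200 * t
            best = total if best is None else min(best, total)
    return best


# Primal calculation, done twice: before and after raising the requirement by 1.
print(min_cost(160))                   # 1000
print(min_cost(161) - min_cost(160))   # 5/2
```

The increase in minimum cost, 5/2, agrees with the value which the dual assigns to the exposure requirement. As the text cautions, such agreement is guaranteed only in the limit, for sufficiently small changes.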

3 To determine the marginal profitability of input i, we may proceed as follows.
Suppose the available amount of input i is, say, \(C_i = 9\); solve the primal problem for
the optimal value of \(\Pi\) using this \(C_i\) figure. Now substitute for \(C_i = 9\) the value
\(C_i + \Delta C_i = 9 + \Delta C_i\) (where some very small number may be used for the \(\Delta C_i\)). Then solve
the primal problem again, only with this increase in the capacity of input i, to yield the
increased profit figure \(\Pi + \Delta\Pi\). It is now a straightforward matter to find \(\Delta\Pi/\Delta C_i\). As
an example, the reader may verify that, if in our illustrative programming problem (3’)
in Chapter 5, the first capacity figure \(C_1\) is increased from 8,000 to 8,001, the optimal
solution matrix which was previously given by (12) in Chapter 5 now becomes

[...]

Thus \(C_1\) has increased by \(\Delta C_1 = 8{,}001 - 8{,}000 = 1\), while \(\Pi\) has risen by
\(\Delta\Pi = 8{,}750\tfrac{1}{4} - 8{,}750 = \tfrac{1}{4}\). Hence, we have \(\Delta\Pi/\Delta C_1 = \tfrac{1}{4}\). Comparison with the
dual value as given later in this chapter shows that \(V_1 = \tfrac{1}{4} = \Delta\Pi/\Delta C_1\), as our theorem
asserts. However, it should be pointed out that this is not always sure to happen, for our
theorem tells us that \(V_i = \partial\Pi/\partial C_i\), and not that \(V_i = \Delta\Pi/\Delta C_i\), i.e., the theorem only
holds in the limit, as \(\Delta C_i\) approaches zero, and then only if there is no discontinuity in
the profit function which would prevent differentiation.

Let us turn now to our other pecuniary variable \(L_j\), the dual slack
variable. This is the accounting loss per unit of output j. That is, we have
from our dual constraints

\[
L_j = V_1a_{1j} + V_2a_{2j} + \cdots + V_ma_{mj} - P_j,
\]

where \(P_j\) is the actual profit per unit of output j and the remaining terms
represent the accounting costs of the scarce inputs used up in producing a
unit of item j. By construction, these accounting costs are designed to
impute profits completely to the firm’s scarce resources, so, at best, for
any output j the unit profit \(P_j\) will just cover the accounting cost \(\sum_i V_ia_{ij}\).
The accounting loss \(L_j\) associated with such items will be zero. On all other
items we will have \(L_j > 0\), i.e., there will be a net accounting loss.

What is the meaning of such an accounting loss? It only implies that 
the inputs used in producing the item in question would be more valuable 
elsewhere. That is, the profit on this item does not cover the marginal con- 
tribution to profitability which could be obtained by using the same in- 
puts in the production of some other goods. Thus, the accounting losses 
are only losses relative to the most profitable alternatives. To illustrate 
the point, suppose it took the same scarce resources to produce a handbag 
and a pair of shoes and that their unit profitability were, respectively, 
$P_h = \$5$ and $P_s = \$7$. If these were the only outputs being considered by
the firm, we would then have $L_h = \$2$ and $L_s = 0$. That is, even though the
firm nets \$5 on every handbag it produces, it still loses \$2 compared to the
amount it could be making by transferring these resources to shoe production.
The accounting loss on shoe output, $L_s$, is zero because shoe manufacturing
is the company's most profitable alternative, i.e., it loses no
opportunity for further gain by keeping bottleneck inputs tied up in shoe
production. Thus $L_h$ and $L_s$ turn out to be what the economist has always
called the opportunity costs of these items.
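
The handbag-and-shoes arithmetic can be checked in a few lines of Python. This is only a sketch under a simplifying assumption not spelled out in the text: there is a single scarce resource, and each item uses exactly one unit of it, so that the accounting price of the resource equals the best unit profit obtainable from it. The names `unit_profit`, `input_per_unit`, and `accounting_loss` are illustrative.

```python
# Unit profits from the example in the text.
unit_profit = {"handbag": 5, "shoes": 7}

# ASSUMPTION: each item uses one unit of the same single scarce resource.
input_per_unit = {"handbag": 1, "shoes": 1}

# With one scarce resource, its accounting price V is the highest profit
# obtainable per unit of that resource.
V = max(unit_profit[k] / input_per_unit[k] for k in unit_profit)

# Accounting loss: L_j = V * a_j - P_j (input charge minus unit profit).
accounting_loss = {k: V * input_per_unit[k] - unit_profit[k] for k in unit_profit}
print(accounting_loss)
```

The loss on handbags is the \$2 forgone by not using the resource for shoes, while the most profitable item carries no loss.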

It is noteworthy, then, how these time-honored concepts of economic 
theory, the marginal product and the opportunity cost, have sneaked 
back into the analysis. No one has put them into the analysis of the primal 
production problem which proceeds largely in terms of the relevant physical 
and technological considerations. Yet, always hiding behind this primal 


cratic the bias of the planner and how abhorrent to him are the unplanned 
workings of the free market, every optimal planning decision which he 
makes must have implicit in it the rationale of the pricing mechanism and
the allocation of resources produced by the profit system. It is noteworthy
that these results have led to the open and well-publicized reintroduction
of marginal analysis into Soviet economics by Russian mathematicians
working on the application of linear programming to economic planning.


4. Duality and Decentralized Decision-making 


There is another related aspect of the matter which has attracted considerable
attention. The dual accounting prices $V_i$ can serve as a device
for steering decentralized decision-making along an optimal course. Consider,
for example, a firm with a large number of plants each of which makes
some use of over-all company resources. If plant A uses more of the com- 
pany's central warehouse space, less will be available to plant B and vice 
versa. Top management can, of course, allocate its bottleneck resources 
among the various plants by making a master plan and deciding, plant by 
plant, input by input, how much should go where. This is the method of 
direct centralized planning and decision-making. 

However, duality theory points to an alternative approach. Suppose
management is in a position to calculate the values of the dual accounting
prices $V_i$ by an ordinary programming computation. (The enormous
practical problems likely to be involved in obtaining the required statistics
and the difficulties caused by nonlinearities must, however, not be forgotten
by the reader to whom the procedure sounds delightfully simple.)
Top management is now in a position to say to all plant directors, "You
can have as much of every input as you want. However, for every unit of
input which you employ, the company will debit you with the dual value
of the item" (presumably some more palatable term would be used). It
is to be noted that our second duality theorem tells us that items which
should be produced in an optimal solution (i.e., for which the optimal
$Q_j > 0$) will ordinarily be those which will yield no accounting loss
($L_j = 0$). That is, only these items will yield unit profits sufficient to cover
the dual value charge on the inputs used to produce them. Thus, if they
are effective profit-makers, plant managers can be left to decide by themselves
which items should be included in the product line, for commodities
which are not optimal from the point of view of the company will cause
the plant to incur a loss.⁴ The perfect plant manager will just break even


4 There still remains the problem of determining how much of each profitable item to 
produce. Under linear programming assumptions which rule out diminishing returns to 
scale, all items which are profitable should be produced in amounts as large as resources 
will permit. Unfortunately this still leaves a substantial calculation problem because 
increased production of one commodity will mean that fewer resources will remain avail- 
able for use in the production of other items. Thus, the dual pricing approach only takes 
care of part of the job of providing efficient decentralized decision-making arrangements. 
There are, however, somewhat related linear programming methods also involving dual 
prices, which, at least in principle, can handle the entire matter. 
because the dual accounting prices are calculated so as to eat up all profits,
but they will cause no loss only if all his product-line decisions are optimal.
Thus dual pricing can, at least in principle, serve as a substitute for centralized
control and it can open the way to efficient decentralized decision-making.
In fact, such an approach is now being employed in some industrial
applications and, as will be indicated in Chapter 21, it has even been suggested
as an appropriate device for the government to use in a socialistic
economy, where, it has been implied, the method can help to attain the
goals of socialism at a relatively low cost in central control and loss of
individual initiative.
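
As a rough illustration of the debit scheme, the sketch below lets a plant manager screen a product line against announced dual prices. Everything numerical here is an assumption made for the example: the dual prices 1/4 and 3/4, the input requirements of products 1 and 2 (chosen to be consistent with the figures quoted for the Chapter 5 illustration), and product 3, which is invented outright to show an item that fails the test.

```python
from fractions import Fraction as F

# ASSUMED dual accounting prices for the two bottleneck inputs.
V = [F(1, 4), F(3, 4)]

# Hypothetical product data: (unit profit P_j, input requirements a_1j, a_2j).
products = {
    "product 1": (F(5, 2), [1, 3]),
    "product 2": (F(2), [2, 2]),
    "product 3": (F(1), [2, 1]),   # invented for illustration only
}

def accounting_loss(profit, inputs):
    """L_j = sum_i V_i * a_ij - P_j: the debit for inputs minus unit profit."""
    return sum(v * a for v, a in zip(V, inputs)) - profit

losses = {name: accounting_loss(p, a) for name, (p, a) in products.items()}

# A manager charged the dual prices keeps only items that break even.
product_line = [name for name, loss in losses.items() if loss == 0]
print(losses, product_line)
```

Products 1 and 2 exactly cover their input charges, whereas each unit of the invented product 3 would leave the plant a quarter-dollar short, so a profit-seeking manager drops it without any instruction from the center.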


5. Solution of the Primal and Dual Programs 


It will be shown in this section that our method for solving the primal 
linear programming problem immediately yields the solution to the dual 
problem without further calculation. To see how it does so we first write 
out a pair of small problems in our standard form as follows: 

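Primal

$\max\ \Pi = 0 + P_1Q_1 + P_2Q_2 + P_3Q_3$

subject to

$U_1 = C_1 - a_{11}Q_1 - a_{12}Q_2 - a_{13}Q_3$
$U_2 = C_2 - a_{21}Q_1 - a_{22}Q_2 - a_{23}Q_3$

all $Q$'s, $U$'s nonnegative.

Dual

$\min\ \alpha = 0 + C_1V_1 + C_2V_2$

subject to

$L_1 = -P_1 + a_{11}V_1 + a_{21}V_2$
$L_2 = -P_2 + a_{12}V_1 + a_{22}V_2$
$L_3 = -P_3 + a_{13}V_1 + a_{23}V_2$

all $V$'s, $L$'s nonnegative.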

Notice the differences in sign in the right-hand sides of the constraint
equations of the primal and dual problems. This difference occurs because
in our original constraint equations (3) the primal slack variables $U_i$ appear
with a plus sign while the dual slack variables $L_j$ appear with a minus.
Now we may write the two problems in matrix form to obtain


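(4) Primal Matrix

$$
\begin{array}{r|rrrr}
 & 1 & Q_1 & Q_2 & Q_3 \\ \hline
\Pi = & 0 & P_1 & P_2 & P_3 \\
U_1 = & C_1 & -a_{11} & -a_{12} & -a_{13} \\
U_2 = & C_2 & -a_{21} & -a_{22} & -a_{23}
\end{array}
$$

Dual Matrix

$$
\begin{array}{r|rrr}
 & 1 & V_1 & V_2 \\ \hline
\alpha = & 0 & C_1 & C_2 \\
L_1 = & -P_1 & a_{11} & a_{21} \\
L_2 = & -P_2 & a_{12} & a_{22} \\
L_3 = & -P_3 & a_{13} & a_{23}
\end{array}
$$
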
These matrices exhibit fully the remarkable symmetry of the primal 
and dual problems. Neglecting differences in signs, the dual matrix is 
merely the primal matrix flipped over so that the lower left-hand and upper 
right-hand corners have exchanged places. In addition, there is a change in 
the sign of every element except those in the first column of the primal 
(first row of the dual) matrix. 
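
The flip-and-change-sign rule just stated is mechanical enough to express as a short Python function. The sketch below is illustrative only: the function name `dual_from_primal` is made up, and the numerical matrix uses assumed input coefficients (1, 2 and 3, 2) together with the profit and capacity figures quoted for the illustrative problem.

```python
def dual_from_primal(primal):
    """Transpose the primal matrix, reversing the sign of every element
    except those that came from the primal's first column."""
    rows, cols = len(primal), len(primal[0])
    return [[primal[i][j] if j == 0 else -primal[i][j] for i in range(rows)]
            for j in range(cols)]

# Rows: Pi, U1, U2.  Columns: 1, Q1, Q2.  (Input coefficients are ASSUMED.)
primal = [
    [0,     2.5,  2],
    [8000, -1,   -2],
    [9000, -3,   -2],
]

# Rows: alpha, L1, L2.  Columns: 1, V1, V2.
dual = dual_from_primal(primal)
for row in dual:
    print(row)
```

The first row of the result holds the capacities unchanged, so the dual objective reads $\alpha = 0 + 8{,}000V_1 + 9{,}000V_2$, and each remaining row is a dual constraint of the form $L_j = -P_j + a_{1j}V_1 + a_{2j}V_2$.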

These observations suggest a measure which can achieve considerable 
economy of notation. All essential information can be preserved in the 
following combined matrix: 


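(5)

$$
\begin{array}{r|rrrr|l}
 & 1 & Q_1 & Q_2 & Q_3 & \\ \hline
\Pi = & 0 & P_1 & P_2 & P_3 & 1 \\
U_1 = & C_1 & -a_{11} & -a_{12} & -a_{13} & V_1 \\
U_2 = & C_2 & -a_{21} & -a_{22} & -a_{23} & V_2 \\ \hline
 & \alpha = & -L_1 = & -L_2 = & -L_3 = &
\end{array}
$$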


Several essential characteristics of this combined matrix should be 
observed. 

1. The combined matrix (5) is simply the primal matrix in (4) with the
dual variables entered along the right and lower sides to indicate their
position in the dual matrix in (4). For example, since $L_1$ appears beside
the second row in the dual matrix in (4), it is written next to the second
column in (5). From (5) we can at once reconstruct our dual matrix by
just interchanging the rows with the columns and reversing the signs of all
elements except those in the first column of (5).

2. Each primal structural variable is paired with the corresponding dual
slack variable, and each primal slack variable is paired with the corresponding
dual structural variable. For example, the third column corresponds
to variables $Q_2$ and $L_2$, while the second row corresponds to the two
variables $U_1$ and $V_1$. We see that the primal and dual variables are always
paired in exactly the same manner as they are in duality Theorem II.

3. As usual, the primal variables appearing at the top of the matrix
(5) take the value zero in the current basic solution (i.e., $Q_1 = Q_2 = Q_3 = 0$),
while the primal variables at the left obtain their values from the first
column of (5) (i.e., $\Pi = 0$, $U_1 = C_1$, $U_2 = C_2$). In the original dual matrix
in (4) the same rule holds, but because (5) is a flipped-over representation
of (4), things appear a bit inverted when we get to the values of the dual
variables in (5). Here the zero-valued dual variables are those at the right
($V_1 = V_2 = 0$) and the values of the dual variables at the bottom are given
by the elements of the first row with their signs reversed (i.e., $\alpha = 0$,
$L_1 = -P_1$, $L_2 = -P_2$, $L_3 = -P_3$). For this reason, as a reminder, minus
signs have been inserted before the dual variables which appear beneath the matrix.

These three characteristics of the combined matrix continue to apply
throughout the simplex calculation. Now consider some pivot operation,
say the one in matrix (5) of Chapter 5, which involves an interchange of
the primal variables we now call $Q_1$ and $U_2$. Suppose in the dual problem
we were to pivot in a way which interchanges the corresponding dual
variables $L_1$ and $V_2$. It is easy to prove by applying the pivoting rules of
Section 8 of Chapter 5 to the dual matrix that the new primal matrix and
the new dual matrix again combine into a single matrix exactly analogous
with (5)! That remarkable result enables us to go through the simplex
procedure step by step simply by carrying out the calculation for the primal
problem alone. As an example, we reproduce as combined matrices the
three primal simplex matrices (5), (7), and (12) of Chapter 5 for our
illustrative problem (3") in that chapter:


(6) [Three combined matrices, labeled 1st Matrix, 2nd Matrix, and 3rd Matrix; their entries did not survive in this copy.]


Each of these matrices satisfies all three characteristics of combined
simplex matrices which have just been described. In particular, the first
matrix yields the basic dual solution $V_1 = V_2 = 0$, $\alpha = 0$, $L_1 = -2.5$,
$L_2 = -2$ (since $-L_1 = 2.5$ and $-L_2 = 2$, so that the values of $L_1$ and $L_2$ are
obtained by changing the signs of the elements in the first row). Similarly,
the third matrix (which provided the optimal solution to the primal problem)
yields the dual basic solution $L_2 = L_1 = 0$, $\alpha = 8{,}750$,
$V_2 = \tfrac{3}{4}$, $V_1 = \tfrac{1}{4}$.

Indeed, this last basic solution is also the optimal solution to the dual 
problem. The optimal solution can always be found in this way, from the 
first row of the optimal combined matrix. For, as will be shown next, this 
dual solution simultaneously satisfies two criteria: 


The dual feasibility criterion, which requires all entries in the first row 
of the combined matrix (except for the element in the left-hand corner) 
to be negative; and the dual optimality criterion, which requires all elements 
in the first column of the combined matrix (except for the upper left-hand 
element) to be positive. 


First let us examine the logic behind the optimality criterion. We do 
so in several steps: 


1. It will be recalled that in a maximization problem the opti- 
mality criterion requires every element in the first row of the matrix 
(except for the profit figure) to be negative because then only prod- 
ucts with negative marginal profits will have been assigned zero 
output levels. Correspondingly, in a minimization problem (view it 
as a problem of cost minimization), the optimal values of the co- 
efficients of the dual objective function will all be positive because 
any process which has a negative coefficient (so that it reduces total 
cost) should be employed by the firm [it should not be represented 
by one of the dual variables at the top of our original dual 
matrix (4)].⁵

2. Because the combined matrix may be viewed as the dual matrix
"lying on its side," the first row of the combined matrix yields the
basic solutions of the dual, as we have seen, and the first column of
the combined matrix gives the coefficients of the dual objective function.
The reader should verify that the initial dual objective function
in (6) is $\alpha = 0 + 8{,}000V_1 + 9{,}000V_2$, corresponding to the entries
0, 8,000, and 9,000 in the first column of the first of our three combined
matrices.

3. We conclude that the optimality criterion for our dual problem
requires all elements in the first column of the matrix to be positive.
This is because the first column of the combined matrix corresponds
to the first row of the dual matrix, and these are the elements which
do not change sign in the transition from the dual to the combined
matrix [compare the dual matrix in (4) with combined matrix (5)].


5 For example, if our problem is to minimize $\alpha = 90 + 3V_1 - 6V_2$, we clearly do not
want $V_2 = 0$.
The reader may well feel at this point that something has gone seriously
wrong. True, the last of our three matrices (6) satisfies the dual optimality
criterion, because the entries in the first column (excluding the $\alpha$ figure)
are the positive numbers 3,750 and 500. But the other two matrices in (6)
also satisfy the criterion; surely they cannot all be optimal! The difficulty
is that the dual solutions corresponding to these first two matrices, while
they do meet the optimality criterion, must be rejected because they are
not even feasible. We saw, for example, that the solution proposed in the
first matrix involves $L_1 = -2.5$ and $L_2 = -2$. But these figures clearly
violate the nonnegativity requirements $L_1 \ge 0$, $L_2 \ge 0$. Indeed, we can
generalize these observations and conclude that any combined matrix 
which corresponds to a nonoptimal solution to the primal problem must 
yield a nonfeasible solution to the dual problem. For if the primal solution 
is not optimal, at least one of the elements in the top row of the matrix 
must be positive, and hence (after the required change in sign) the value 
of the corresponding dual variable must be negative. This observation 
also yields our dual feasibility criterion, which requires that all entries in 
the first row of the combined matrix be negative. 

In sum, when and only when we have found an optimal matrix for the 
primal problem, the corresponding combined matrix will yield an optimal 
dual solution. For in such a matrix the elements of the first column must 
be positive (to assure that the primal solution is feasible) and the elements 
of the first row must be negative (to assure optimality of the primal solu- 
tion). It follows that the dual solution must be feasible (because the 
elements of the first row are all negative) and optimal (because the first 
column elements are positive). 

Specifically, the solution given by the last matrix in (6) is, therefore, 
the optimal solution to our dual problem, and it has been read off directly 
from the optimal primal matrix without any further calculation! 
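
The claim that the dual solution comes free with the primal calculation can be tried out numerically. The sketch below is not the book's condensed combined-matrix layout; it is an ordinary textbook simplex tableau with explicit slack columns, in which the dual values appear in the objective row beneath the slack variables. The constraint coefficients in `A` are an assumption, chosen to be consistent with the figures quoted in this chapter (unit profits 2.5 and 2, capacities 8,000 and 9,000, optimal profit 8,750); problem (3") itself is not reproduced here. No anti-cycling rule is included.

```python
from fractions import Fraction as F

def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b_i >= 0."""
    m, n = len(A), len(c)
    # Constraint rows [A | I | b]; final row holds the reduced costs [-c | 0 | 0].
    T = [[F(v) for v in A[i]] + [F(int(i == k)) for k in range(m)] + [F(b[i])]
         for i in range(m)]
    T.append([-F(v) for v in c] + [F(0)] * (m + 1))
    basis = [n + i for i in range(m)]          # start with the slacks basic
    while True:
        col = min(range(n + m), key=lambda j: T[-1][j])   # entering variable
        if T[-1][col] >= 0:
            break                                          # optimal
        rows = [i for i in range(m) if T[i][col] > 0]
        if not rows:
            raise ValueError("objective is unbounded")
        row = min(rows, key=lambda i: T[i][-1] / T[i][col])  # ratio test
        piv = T[row][col]
        T[row] = [v / piv for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                f = T[i][col]
                T[i] = [a - f * p for a, p in zip(T[i], T[row])]
        basis[row] = col
    x = [F(0)] * (n + m)
    for i, k in enumerate(basis):
        x[k] = T[i][-1]
    duals = T[-1][n:n + m]   # objective-row entries under the slack columns
    return x[:n], x[n:], duals, T[-1][-1]

c = [F(5, 2), F(2)]          # unit profits quoted in the text
A = [[1, 2], [3, 2]]         # ASSUMED input coefficients
b = [8000, 9000]             # capacities quoted in the text

Q, U, V, profit = simplex_max(c, A, b)
alpha = sum(bi * vi for bi, vi in zip(b, V))
print(Q, U, V, profit, alpha)
```

Only primal pivots are performed, yet the final objective row already contains a feasible dual solution whose value $\alpha$ equals the primal profit.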

For the same reasons it is possible to solve the primal problem by first 
dealing with the dual problem, that is, by solving the dual problem, thereby 
obtaining the solution to the primal problem as a costless bonus. This is 
called the dual simplex method and was developed by C. E. Lemke. Which 
of these two methods it pays to use is a matter of convenience or computa- 
tional efficiency since the dual and primal problems are usually not equally 
easy to solve. 

The option of solving a problem either directly or through its dual has
an important application in cases such as the primal problem (2) for which
$Q_1 = Q_2 = \cdots = Q_n = 0$ is not a feasible solution so that we cannot use
this as our initial basic solution. Frequently in such cases the dual problem
is not characterized by similar difficulties. For suppose in the primal problem
we have only "greater than or equal to" constraints, such as
$0.6M + 2T + 0.7R \ge 23$ in (2). This requirement is clearly not satisfied by
$M = T = R = 0$. But in such a case the dual problem will have only "less than
or equal to" constraints such as $9V_1 + 2V_2 \le 400$, which are satisfied by
$V_1 = V_2 = 0$. Hence, in such circumstances, it is possible to solve the
primal problem by working only with the dual problem and starting off with
the basic solution corresponding to the origin of the dual problem's feasible
region, where all the ordinary variables $V_1, V_2, \ldots, V_m$ are equal to zero.
In such a situation, we can avoid the additional work required by the use
of the feasibility program which was invented to deal with problems in
which the origin is not a feasible solution.⁶


PROBLEMS 


1. What is the dual basic solution given by the second matrix in (6)? 
2. Is this solution feasible? Explain. 
3. Given simplex matrix 


H =| 30 -2 —4 -1 1/1 
Q = 9 Dy 
Us m 7 Ys 


a= -V= -h= -V= 


(a) What is the corresponding solution to the primal problem? 
(b) What is the corresponding solution to the dual problem? 
(c) Show that both solutions are feasible. 

(d) Show that both solutions are optimal. 

(e) Show that duality Theorem I is satisfied. 

(f) Show that duality Theorem II is satisfied. 


6 See Appendix B to this chapter. However, the approach just described may not
work if the primal contains both "less than or equal to" and "greater than or equal to"
constraints, for then the same is likely to be true of the dual program.

We can tell by direct inspection of the simplex matrix when it is possible to proceed 
by way of the dual in the manner just described. Feasibility for the primal problem 
requires all elements in the first column to be positive. Feasibility for the dual problem 
requires all elements in the first row of the combined matrix to be negative. If the first of 
these conditions is met, we can proceed by solving the primal problem directly. If only 
the second condition is satisfied, it is strategic to proceed via the dual. If it happens 
simultaneously that both conditions are satisfied, the reader should verify that we
have found a matrix whose solutions are feasible and optimal for both primal and dual,
and there is then nothing further to compute.
4. Why can the following not be a pair of optimal solutions?

$Q_1 = 17$, $Q_2 = 0$, $U_1 = 2$, $U_2 = 4$, $U_3 = 0$, $L_1 = 0$,
$L_2 = 5$, $V_1 = 8$, $V_2 = 0$, $V_3 = 6$.


5. Solve the advertising problem (13) of Chapter 5, confirming the graphic solution
M = 10, t = 3.


6. Another Look at the Simplex Method 


Our dual theoretic approach to the solution of linear programming 
problems can also offer us new insights into the nature of the simplex 
method. In Chapter 5 we described this method as a sequence of steps in 
which we go from basic solution to basic solution, always seeking at each 
step to improve matters—to increase profits in a profit maximization prob- 
lem, or to decrease costs in a cost minimization problem. 

We can now obtain readily an alternative and rather illuminating way 
of looking at the simplex procedure. We have seen from our duality theorems 
that an optimal solution to a pair of dual programs is given by any pair of 
feasible primal and dual solutions that satisfies the so-called “complementary 
slackness” conditions 


(7)   $Q_1L_1 = 0,\ Q_2L_2 = 0,\ \ldots,\ Q_nL_n = 0$
      $U_1V_1 = 0,\ U_2V_2 = 0,\ \ldots,\ U_mV_m = 0$.


Thus, for every pair of values $Q_j$ and $L_j$ at least one of the two variables
must be zero and for every pair $U_i$ and $V_i$ at least one of these variables
must be zero. The strategy of the simplex method is to examine various
combinations of values of the $Q_j$, $L_j$, $U_i$ and $V_i$, all of which satisfy this
complementary slackness property, and to find among them a set of values
that satisfies the feasibility requirement of the primal and the dual problems.
It does so by starting from a set of values of the variables that satisfy
the complementary slackness requirements (7). If these are not feasible it
then takes, say, some zero-valued $Q$, call it $Q_2$, and next time around permits
$Q_2 \ne 0$ but instead sets $L_2 = 0$. Simultaneously it does the same for some
pair of variables $U_m$ and $V_m$. Such an interchange is precisely what
constitutes a pivot step.
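All of this is readily illustrated by the following schematic simplex
diagram, in which the variables along the top and along the right-hand
side are the zero-valued variables:


$$
\begin{array}{r|rrrcr|l}
 & 1 & Q_1 & Q_2 & \cdots & Q_n & \\ \hline
\Pi = & 0 & P_1 & P_2 & \cdots & P_n & 1 \\
U_1 = & C_1 & -a_{11} & -a_{12} & \cdots & -a_{1n} & V_1 \\
U_2 = & C_2 & -a_{21} & -a_{22} & \cdots & -a_{2n} & V_2 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
U_m = & C_m & -a_{m1} & -a_{m2} & \cdots & -a_{mn} & V_m \\ \hline
 & \alpha = & -L_1 = & -L_2 = & \cdots & -L_n = &
\end{array}
$$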


Notice that each of the primal variables on top of the matrix and each of
the dual variables on its right are assigned the value zero, while the corresponding
variables on the left and the bottom of the matrix are, in general,
nonzero. Obviously, such a set of values satisfies the complementary slackness
requirements (7). However, as we have seen in the preceding section,
these values will not generally be primal and dual feasible, i.e., we will not
generally meet all the requirements


(8)   $C_1 \ge 0,\ C_2 \ge 0,\ \ldots,\ C_m \ge 0$;   $P_1 \le 0,\ P_2 \le 0,\ \ldots,\ P_n \le 0$;


for if this last set of feasibility conditions were all satisfied we would already
have arrived at an optimal solution since our current basic solution
satisfies (7).


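If the feasibility criteria (8) are not met, we then undertake a pivot step.
Say, for the sake of illustration, that the pivot element is $a_{m2}$.
In that case the next simplex matrix takes the form

$$
\begin{array}{r|rrrcr|l}
 & 1 & Q_1 & U_m & \cdots & Q_n & \\ \hline
\Pi = & & & & & & 1 \\
U_1 = & & & & & & V_1 \\
U_2 = & & & & & & V_2 \\
\vdots & & & & & & \vdots \\
Q_2 = & & & & & & L_2 \\ \hline
 & \alpha = & -L_1 = & -V_m = & \cdots & -L_n = &
\end{array}
$$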


where the entries inside the matrix are not indicated. What has happened
in this pivot step is that instead of


$Q_2 = 0$, $L_2 \gtrless 0$, $V_m = 0$, $U_m \gtrless 0$, as before,


we now have


$L_2 = 0$, $Q_2 \gtrless 0$, $U_m = 0$, $V_m \gtrless 0$.


In this way we have interchanged the assignment of zeros, as between two
primal variables, $Q_2$ and $U_m$, and as between two dual variables, $L_2$ and $V_m$,
in such a way as to guarantee that the complementary slackness conditions
will remain satisfied. In the process, the values of all of the nonzero-valued
variables will have changed and perhaps they will now all be feasible and,
hence, optimal. If not, one goes on to still another pivot step, and so on.

The strategy of the simplex method, then, may be summarized as 
follows: One goes from basic solution to basic solution for the primal and 
the dual simultaneously. The pivoting rules are designed to guarantee 
that in this process one never will double back on one’s steps—no basic 
solution will ever be encountered twice in the calculation process. The pivot 
steps are conducted in a way that also guarantees that the complementary 
slackness requirements (7) are always satisfied. Since there is only a finite 
number of basic solutions (corners of the feasible region), eventually this 
process must arrive at a solution that is feasible if any such solution exists 
(cf. Appendix A, Theorem 6). At that point the simplex method will have 
brought the calculation to the optimal solution that is desired. 
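
For a problem small enough to enumerate, this view of the method can be examined directly. The sketch below lists every basic solution of a two-product, two-input problem, builds the complementary dual basic solution by zeroing each dual variable whose primal partner is basic, and records which pairs are feasible. The input coefficients in `A` are an assumption consistent with the figures quoted for the illustrative problem; `solve2` is a hypothetical helper using Cramer's rule.

```python
from fractions import Fraction as F
from itertools import combinations

P = [F(5, 2), F(2)]               # unit profits quoted in the text
C = [F(8000), F(9000)]            # capacities quoted in the text
A = [[F(1), F(2)], [F(3), F(2)]]  # ASSUMED input coefficients

# Variable order: primal (Q1, Q2, U1, U2); dual partners (L1, L2, V1, V2).
# Columns of  A Q + U = C  and of  A'V - L = P.
primal_cols = [(A[0][0], A[1][0]), (A[0][1], A[1][1]), (F(1), F(0)), (F(0), F(1))]
dual_cols = [(F(-1), F(0)), (F(0), F(-1)), (A[0][0], A[0][1]), (A[1][0], A[1][1])]

def solve2(col_x, col_y, rhs):
    """Solve x*col_x + y*col_y = rhs by Cramer's rule; None if singular."""
    det = col_x[0] * col_y[1] - col_y[0] * col_x[1]
    if det == 0:
        return None
    x = (rhs[0] * col_y[1] - col_y[0] * rhs[1]) / det
    y = (col_x[0] * rhs[1] - rhs[0] * col_x[1]) / det
    return x, y

results = []
for basic in combinations(range(4), 2):
    nonbasic = [k for k in range(4) if k not in basic]
    prim = solve2(primal_cols[basic[0]], primal_cols[basic[1]], C)
    # Complementary slackness: the dual basis is the complement of the primal one.
    dual = solve2(dual_cols[nonbasic[0]], dual_cols[nonbasic[1]], P)
    if prim is None or dual is None:
        continue
    x, y = [F(0)] * 4, [F(0)] * 4
    x[basic[0]], x[basic[1]] = prim
    y[nonbasic[0]], y[nonbasic[1]] = dual
    profit = P[0] * x[0] + P[1] * x[1]
    alpha = C[0] * y[2] + C[1] * y[3]
    feasible = all(v >= 0 for v in x) and all(v >= 0 for v in y)
    results.append((basic, x, y, profit, alpha, feasible))

for basic, x, y, profit, alpha, feasible in results:
    print(basic, profit, alpha, feasible)
```

Under these assumptions every complementary pair gives equal primal and dual objective values, but only one pair is feasible in both problems, and that pair is the optimum the pivot steps are searching for.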
APPENDIX A: ON THE DERIVATION OF THE DUALITY THEOREMS 


This appendix undertakes to indicate some of the approaches which 
have been employed in proofs of the duality theorems. In general, the 
mathematical arguments are intended merely to be suggestive rather than 
rigorous. 

Before getting to the theorems themselves we establish a key preliminary 
result due to A. W. Tucker, from which most of our duality theorems can 
readily be derived. Given any feasible solutions of the primal and dual
problems, this proposition describes a general relationship between the
values of the primal and dual objective functions $\Pi$ and $\alpha$ in terms of the
other variables of the two problems:


(1)   $\alpha - \Pi = \sum_i U_iV_i + \sum_j Q_jL_j$.


In words, this equation states that for any pair of feasible solutions,
the value of the dual, $\alpha$, exceeds the value of the primal by the sum of the
values of the primal slacks $U_i$, each multiplied by the corresponding dual
ordinary variable $V_i$, plus the sum of the primal ordinary variables $Q_j$,
each multiplied by the corresponding dual slack variable $L_j$.

To derive this equation we must examine the constraints of the primal 
and dual problems which establish the requirements of feasibility. In slack 
variable form these constraints may be written 


(2)   Primal: $\sum_j a_{1j}Q_j + U_1 = C_1,\ \ldots,\ \sum_j a_{mj}Q_j + U_m = C_m$

      Dual: $\sum_i V_ia_{i1} - L_1 = P_1,\ \ldots,\ \sum_i V_ia_{in} - L_n = P_n$

Multiplying the first primal equation through by $V_1$, the second by
$V_2$, etc., and then multiplying the first dual equation by $Q_1$, etc., we obtain


(3)   Primal: $\sum_j V_1a_{1j}Q_j + U_1V_1 = C_1V_1,\ \ldots,\ \sum_j V_ma_{mj}Q_j + U_mV_m = C_mV_m$

      Dual: $\sum_i V_ia_{i1}Q_1 - L_1Q_1 = P_1Q_1,\ \ldots,\ \sum_i V_ia_{in}Q_n - L_nQ_n = P_nQ_n$

j $ 


Now adding together the primal constraints in (3) and remembering the
dual objective function,

(4)   $\alpha = \sum_i C_iV_i$,

we obtain

(5)   $\sum_i \sum_j V_ia_{ij}Q_j + \sum_i U_iV_i = \sum_i C_iV_i = \alpha$.

Similarly, by adding together the dual constraints in (3) and using the
primal objective function,

(6)   $\Pi = \sum_j P_jQ_j$,

we have

(7)   $\sum_i \sum_j V_ia_{ij}Q_j - \sum_j L_jQ_j = \sum_j P_jQ_j = \Pi$.


Since the first terms in (5) and (7) are identical, we obtain the required
relationship $\Pi + \sum_j L_jQ_j = \alpha - \sum_i U_iV_i$, which is clearly equivalent to
the Tucker equation (1).
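
A quick numerical check of the Tucker equation may be reassuring. The sketch below picks an arbitrary (non-optimal) feasible pair for a small problem and compares $\alpha - \Pi$ with $\sum_i U_iV_i + \sum_j Q_jL_j$. The coefficients in `A` and the trial values of `Q` and `V` are assumptions chosen for illustration; `tucker_gap` is a made-up name.

```python
from fractions import Fraction as F

P = [F(5, 2), F(2)]        # unit profits
C = [F(8000), F(9000)]     # capacities
A = [[1, 2], [3, 2]]       # ASSUMED input coefficients a_ij

def tucker_gap(Q, V):
    """Return (alpha - Pi, sum U_i V_i + sum Q_j L_j, U, L) for given Q and V."""
    U = [C[i] - sum(A[i][j] * Q[j] for j in range(2)) for i in range(2)]  # primal slacks
    L = [sum(V[i] * A[i][j] for i in range(2)) - P[j] for j in range(2)]  # dual slacks
    profit = sum(P[j] * Q[j] for j in range(2))
    alpha = sum(C[i] * V[i] for i in range(2))
    rhs = sum(U[i] * V[i] for i in range(2)) + sum(Q[j] * L[j] for j in range(2))
    return alpha - profit, rhs, U, L

# An arbitrary feasible pair: all of Q, U, V, L turn out nonnegative.
gap, rhs, U, L = tucker_gap(Q=[F(1000), F(2000)], V=[F(1), F(1)])
print(gap, rhs, U, L)

# A pair with U_i V_i = 0 and Q_j L_j = 0 for every i and j closes the gap.
gap_opt, rhs_opt, _, _ = tucker_gap(Q=[F(500), F(3750)], V=[F(1, 4), F(3, 4)])
print(gap_opt, rhs_opt)
```

Because every term on the right-hand side is a product of nonnegative numbers for a feasible pair, the gap can never be negative, which is the content of Theorem 1 below.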


From (1), we observe at once 


Theorem 1: For any feasible solution to the primal problem and any (not
necessarily related) feasible solution of the dual we have for the values of the
objective functions $\alpha \ge \Pi$. This follows from (1) because feasibility requires


(8)   $L_j \ge 0$, $Q_j \ge 0$, $U_i \ge 0$, and $V_i \ge 0$ for all $i$ and $j$.

A simple extension of the argument gives us


Theorem 2: If, for some particular pair of feasible solutions, we have
$\alpha^* = \Pi^*$, those feasible solutions of the primal and dual problems must also be
optimal. Incidentally, we shall see presently that the converse of Theorem 2
is also valid: for any pair of optimal solutions we must have $\alpha = \Pi$.

To derive Theorem 2, observe that if any other feasible solution of the
primal problem yields a value of the objective function $\Pi^{**}$, we must have,
by Theorem 1, $\Pi^{**} \le \alpha^*$, so that, by hypothesis, any such $\Pi^{**} \le \Pi^*$,
i.e., $\Pi^*$ must be the maximal value of $\Pi$. The proof that $\alpha^*$ is the minimal
value of $\alpha$ is precisely analogous and the reader should go through it as
an exercise.

Viewed intuitively, Theorem 1 has, in effect, established any $\alpha$ as a
ceiling on possible values of $\Pi$, and the logic of Theorem 2 is that a $\Pi$ which
reaches such a ceiling (any $\Pi^* = \alpha^*$) must be maximal. A similar argument
applies to $\alpha^*$ with any $\Pi$ acting as a floor to the possible values of $\alpha$.
It is Theorem 2 rather than its converse (which is more difficult to prove)
that is most useful in application. For Theorem 2 tells us that if by means
of the simplex method we find any pair of feasible solutions for which $\Pi = \alpha$,
we can be sure we have arrived at a pair of optimal solutions.
Next, from (1), we deduce


Theorem 3: Any pair of feasible solutions will yield $\Pi = \alpha$ if and only if,
for each $i$ and each $j$, $U_iV_i = 0$ and $L_jQ_j = 0$.

Proof: By (8), if any $L_jQ_j \ne 0$ that product must be positive and,
similarly, if any $U_iV_i \ne 0$, we must have $U_iV_i > 0$. In either such case


$\sum_j L_jQ_j + \sum_i U_iV_i$


must be positive because it is a sum of nonnegative terms not all of which
are zero. Thus, if and only if such a situation occurs we must have $\alpha > \Pi$.
Q.E.D.


Theorem 4: If a pair of solutions is optimal, they must satisfy $\alpha = \Pi$.


This is the converse of Theorem 2 and states that in every pair of optimal
solutions the values of the primal and dual objective functions must
coincide.

For a rigorous proof we refer the reader to any more-advanced book on
programming. However, a brief intuitive discussion may be illuminating.
Consider any pair of feasible solutions to a primal and dual program for
which $\alpha \ne \Pi$. Can these be optimal? We will suggest now why they cannot.
A glance at the key Tucker equation (1) indicates that if $\alpha \ne \Pi$ we must
have, since all the variable values are nonnegative, either


$U_iV_i > 0$ for some input $i$, or
$Q_jL_j > 0$ for some output $j$ (or both).


The first of these inequalities implies that we are not using all of the available
quantity of input $i$ ($U_i > 0$) even though its marginal yield is positive
($V_i > 0$). Such a solution clearly cannot yield maximum profits. Similarly,
if $Q_jL_j > 0$, we must be producing an output, $j$, which incurs an opportunity
loss ($L_j > 0$), also clearly violating a fundamental requirement of optimality.
Hence, any solution for which $\alpha \ne \Pi$ is seen necessarily to fail the
optimality requirements that we long ago learned from marginal analysis.
Thus, only if $\alpha = \Pi$ can the solution be optimal, just as Theorem 4 asserts.

We come now to the very important theorem that confirms the legitimacy
of our interpretation of the dual variable $V_i$ as the marginal yield of
input $i$. Specifically, we have


Theorem 5: If the derivative $\partial\Pi^0/\partial C_i$ exists, we must have

$V_i = \partial\Pi^0/\partial C_i$.

Unfortunately, a rigorous proof of this result is rather difficult. A graphic
argument can be utilized to indicate the validity of the result and to suggest
the character of the extension of Theorem 5 to the important class of
cases where $\partial\Pi^0/\partial C_i$ does not exist, but even that is too lengthy to be
appropriate here.⁷
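
Although the proof is omitted, the theorem is easy to probe numerically on a two-variable problem. The sketch below finds the maximum profit by checking every corner of the feasible region, then measures how that maximum responds to a one-unit change in the first capacity. The input coefficients are an assumption consistent with the figures quoted for the illustrative problem, and `max_profit` and `slope` are hypothetical helper names.

```python
from fractions import Fraction as F
from itertools import combinations

P = (F(5, 2), F(2))          # unit profits
A = ((1, 2), (3, 2))         # ASSUMED input coefficients

def max_profit(P, A, C):
    """Largest profit over the corners of {Q >= 0, A Q <= C}, two variables."""
    # Boundary lines g.Q = h: the two constraints plus the two axes.
    lines = [(A[i], C[i]) for i in range(len(A))] + [((1, 0), 0), ((0, 1), 0)]
    best = None
    for (g1, h1), (g2, h2) in combinations(lines, 2):
        det = F(g1[0] * g2[1] - g1[1] * g2[0])
        if det == 0:
            continue                       # parallel lines, no corner
        q1 = (h1 * g2[1] - g1[1] * h2) / det
        q2 = (g1[0] * h2 - h1 * g2[0]) / det
        feasible = q1 >= 0 and q2 >= 0 and all(
            a[0] * q1 + a[1] * q2 <= c for a, c in zip(A, C))
        if feasible:
            value = P[0] * q1 + P[1] * q2
            best = value if best is None else max(best, value)
    return best

def slope(c1, step):
    """Change in maximum profit per unit change in the first capacity."""
    return (max_profit(P, A, (c1 + step, 9000)) - max_profit(P, A, (c1, 9000))) / step

print(slope(8000, 1))     # away from a kink
print(slope(9000, -1))    # just to the left of a kink
print(slope(9000, 1))     # just to the right of the same kink
```

Under these assumptions the profit curve has slope 1/4 around a capacity of 8,000, matching the dual value, while at a capacity of 9,000 the left-hand and right-hand slopes differ, so the derivative does not exist there and the dual value can only be bracketed between the two.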


Theorem 6: If there is a feasible solution to the primal problem and a
feasible solution to the dual problem, then both problems also possess an
optimal solution.


Outline of Proof: There are only two ways in which a programming problem
can fail to have an optimal solution: (a) if it involves a contradiction so
that it possesses no solution, or (b) if the value of the objective function


7 For these materials see M. L. Balinski and W. J. Baumol, “The Dual in Nonlinear 
Programming and its Economic Interpretation," Review of Economic Studies, Vol. XXXV, 
July 1968. 

In intuitive terms, in trying to find $\partial\Pi^0/\partial C_i$ we are investigating what happens to
potential profits as we increase the capacity, $C_i$, of the firm's $i$th input, i.e., as we "ease"
the firm's $i$th constraint, $a_{i1}Q_1 + \cdots + a_{in}Q_n \le C_i$. Obviously, an increase in the
available capacity of input $i$ will generally increase the profits the firm can earn. Specifically,
it is not difficult to show that in a linear programming problem the relationship between
$\Pi^0$ and $C_i$ will have the piecewise linear shape shown by curve Oabcd in Figure 1, where
the graph $\Pi^0 = f(C_i)$ increases monotonically with $C_i$ but gradually levels off (diminishing
returns to input $i$ alone). Now, if the actual value of $C_i$ in a numerical programming
problem happens to be represented by a point like $C_x$ that does not lie below a "kink" in
the graph, then the derivative, $\partial\Pi^0/\partial C_i$, clearly exists and we can show that in this case
the value of the dual variable, $V_i$, will be equal to that derivative, i.e., to the slope of the
segment bc. On the other hand, suppose that in a numerical problem the value of $C_i$ is
like that shown by $C_a$, i.e., it lies below a kink, b, in the profit curve. Here the slope of the
profit curve is no longer defined. However, even for this case we can show that $V_i$ will lie
somewhere between the slope of segment ab and the slope of segment bc, i.e., between the
value of $\partial\Pi^0/\partial C_i$ just to the left of point b, and its value just to the right of b.
is unbounded so that there is nothing to prevent it from "becoming infinite."
[Examples of problems which have no solution: case a: Max $\Pi = 6x + 3y$


subject to
$x \ge 5$ and $x - 2 \le 2$.


Since $x$ cannot be both greater than 5 and less than 4, this problem has
no solution.


Example of case b: Max $\Pi = 6x + 3y$
subject to 
zz. 


Here x can be increased without limit and so, therefore, can $\Pi$. There is,
then, no finite maximum value of $\Pi$.] But if both the primal and dual
problems possess feasible solutions, neither of them can involve a contradic-
tion (condition a, above). Moreover, let $\Pi^*$ and $\alpha^*$ be the values of $\Pi$
and $\alpha$ corresponding to these solutions. Then by Theorem 1, above, for
any solutions $\Pi \le \alpha^*$ and $\Pi^* \le \alpha$. Thus it is impossible for $\Pi$ to increase
without limit or for $\alpha$ to decrease without limit, i.e., both $\Pi$ and $\alpha$ are
bounded (condition b). Thus both our conditions for the existence of an
optimal solution are satisfied and so both the primal and dual problems
must have an optimal solution.
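
The bounding argument taken from Theorem 1 can be checked numerically. The sketch below uses a small hypothetical primal-dual pair (its coefficients are illustrative and do not come from the text) and confirms, on a coarse grid of sample points, that no feasible value of the primal objective exceeds any feasible value of the dual objective.

```python
import itertools

# Hypothetical primal (illustrative numbers, not from the text):
#   max Pi = 3*Q1 + 2*Q2   s.t.  Q1 + Q2 <= 4,  Q1 + 3*Q2 <= 6,  Q >= 0
# Its dual:
#   min alpha = 4*V1 + 6*V2  s.t.  V1 + V2 >= 3,  V1 + 3*V2 >= 2,  V >= 0
P = [3, 2]
C = [4, 6]
A = [[1, 1],
     [1, 3]]

def primal_feasible(Q):
    """Check Q >= 0 and sum_j a_ij * Q_j <= C_i for every constraint i."""
    return all(q >= 0 for q in Q) and all(
        sum(A[i][j] * Q[j] for j in range(2)) <= C[i] for i in range(2))

def dual_feasible(V):
    """Check V >= 0 and sum_i a_ij * V_i >= P_j for every output j."""
    return all(v >= 0 for v in V) and all(
        sum(A[i][j] * V[i] for i in range(2)) >= P[j] for j in range(2))

grid = [k * 0.5 for k in range(13)]  # 0, 0.5, ..., 6

primal_values = [sum(p * q for p, q in zip(P, Q))
                 for Q in itertools.product(grid, repeat=2)
                 if primal_feasible(Q)]
dual_values = [sum(c * v for c, v in zip(C, V))
               for V in itertools.product(grid, repeat=2)
               if dual_feasible(V)]

# Every sampled feasible Pi is bounded above by every sampled feasible alpha,
# so Pi cannot grow without limit and alpha cannot fall without limit.
assert max(primal_values) <= min(dual_values)
print(max(primal_values), min(dual_values))
```

In this particular example the largest sampled primal value and the smallest sampled dual value coincide, which is what one would expect when the grid happens to contain the optimal points of both programs.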
We come now to the Lagrange multiplier theorem. 


Theorem ?: The value of $\Pi = \sum_j P_j Q_j$ will be maximized subject to the
inequality constraints $\sum_j a_{ij} Q_j \le C_i$ if and only if the unconstrained
Lagrangian expression


(9)    $\Pi_\lambda = \sum_j P_j Q_j - \sum_i \lambda_i \bigl(\sum_j a_{ij} Q_j - C_i\bigr)$


also attains its maximum, and where the ith Lagrange multiplier, $\lambda_i = V_i^{\circ}$,
is the optimal value of the ith dual variable, so that the Lagrangian
expression (9) becomes


(10)   $\Pi_\lambda = \sum_j P_j Q_j - \sum_i \sum_j V_i^{\circ} a_{ij} Q_j + \sum_i C_i V_i^{\circ}$.


Moreover, the corresponding Lagrangian function for the dual problem
is identical with (10) except that in the $\alpha_\lambda$ expression it is not the $V_i$ but
the $Q_j$ which are assigned their optimal values, $Q_j^{\circ}$.

This theorem tells us, among other surprising things, that the La- 
grangian method of the differential calculus can also be applied to linear 
programming problems despite the fact that the constraints are inequalities
and that they have not been changed into equations by the insertion of
slack variables!

The correspondence between the Lagrangian expression for the primal
and dual problems can readily be verified by the reader by forming the $\alpha_\lambda$
equation just as we derived that for $\Pi_\lambda$ in (10). Let us ignore for the mo-
ment the requirement that any variables take optimal values in (10).
This gives us a generalized Lagrangian expression which we denote by
L(Q, V), i.e.,


(11)   $L(Q, V) = \sum_j P_j Q_j - \sum_i \sum_j V_i a_{ij} Q_j + \sum_i C_i V_i$
               $= \Pi - \sum_i \sum_j V_i a_{ij} Q_j + \alpha$.


Clearly, by (10), $L(Q, V^{\circ}) = \Pi_\lambda$, and, similarly, $L(Q^{\circ}, V) = \alpha_\lambda$. Now we
shall prove that any value of Q which maximizes $\Pi$ subject to our primal
constraints must also maximize $\Pi_\lambda = L(Q, V^{\circ})$.


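By (5) and (7), we have for any $\alpha$ and any $\Pi$

       $\alpha \ge \sum_i \sum_j V_i a_{ij} Q_j \ge \Pi$,

and even for the optimal $V_i$

(12)   $\sum_i \sum_j V_i^{\circ} a_{ij} Q_j \ge \Pi$.

But by Theorems 2 and 4, for optimal solutions

(13)   $\alpha^{\circ} = \sum_i \sum_j V_i^{\circ} a_{ij} Q_j^{\circ} = \Pi^{\circ}$.

Thus, by (12) and (13),

       $\Pi - \sum_i \sum_j V_i^{\circ} a_{ij} Q_j \le \Pi^{\circ} - \sum_i \sum_j V_i^{\circ} a_{ij} Q_j^{\circ}$.

Hence comparing

       $L(Q, V^{\circ}) = \Pi - \sum_i \sum_j V_i^{\circ} a_{ij} Q_j + \alpha^{\circ}$

with

       $L(Q^{\circ}, V^{\circ}) = \Pi^{\circ} - \sum_i \sum_j V_i^{\circ} a_{ij} Q_j^{\circ} + \alpha^{\circ}$

we see at once that

       $L(Q^{\circ}, V^{\circ}) \ge L(Q, V^{\circ})$.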


Thus, it follows that the values $Q^{\circ}$ which maximize $\Pi$ must also maximize
the value of the Lagrangian expression $L(Q, V^{\circ}) = \Pi_\lambda$, which is what we
were to prove.

The corresponding result for the dual problem is left as an exercise for
the reader. The converse of the theorem [i.e., any values of Q which
maximize $L(Q, V^{\circ})$ must also maximize $\Pi$ subject to our constraints] is a
little more difficult to prove and the derivation will therefore not be given
here.

This theorem will be referred to again in the discussion of the Kuhn-
Tucker theorem in the next chapter. However, a geometric interpretation
may first be indicated briefly. Our theorem states that the optimal $Q_j$ will
maximize $\Pi_\lambda = L(Q, V^{\circ})$. Similarly, the dual formulation of the theorem
states that the optimal $V_i$ will minimize $\alpha_\lambda = L(Q^{\circ}, V)$. Hence, for $V_i = V_i^{\circ}$
we will have L(Q, V) maximal for optimal values of the Q's and for $Q_j = Q_j^{\circ}$
it is minimal for optimal values of the V's. Our situation is that shown in
Figure 2 for the trivial case where there is one primal variable, Q, and only
one dual variable, V. S is the point on the L(Q, V) surface corresponding
to the optimal value $Q^{\circ}$ of the primal variable and the optimal value $V^{\circ}$
of the dual variable. Moreover, S is the lowest point in the valley DSD'
which is obtained by varying the value of V. Finally, S is the highest
point on the hill HSH' to which L(Q, V) can be brought by varying the
value of Q. It should be obvious from the shape of the surface why S is
called a saddle point.

[Figure 2: the surface L(Q, V)]
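
The saddle-point property can be illustrated numerically. The following is a minimal sketch that assumes a small hypothetical linear program (coefficients, optimal values, and variable names are illustrative, not taken from the text) and evaluates the generalized Lagrangian of (11) on a grid of nonnegative Q and V.

```python
import itertools

# Hypothetical program: max 3*Q1 + 2*Q2  s.t.  Q1 + Q2 <= 4,  Q1 + 3*Q2 <= 6.
# Its optimal primal and dual solutions (found by inspecting the corner points).
P = [3, 2]
C = [4, 6]
A = [[1, 1],
     [1, 3]]
Q_OPT = (4, 0)
V_OPT = (3, 0)

def lagrangian(Q, V):
    """L(Q, V) = sum P_j Q_j - sum sum V_i a_ij Q_j + sum C_i V_i, as in (11)."""
    profit = sum(P[j] * Q[j] for j in range(2))
    cross = sum(V[i] * A[i][j] * Q[j] for i in range(2) for j in range(2))
    alpha = sum(C[i] * V[i] for i in range(2))
    return profit - cross + alpha

saddle_value = lagrangian(Q_OPT, V_OPT)

grid = [k * 0.5 for k in range(21)]  # 0, 0.5, ..., 10
for point in itertools.product(grid, repeat=2):
    # Varying Q with V held at its optimum never raises L above the saddle value.
    assert lagrangian(point, V_OPT) <= saddle_value
    # Varying V with Q held at its optimum never lowers L below the saddle value.
    assert lagrangian(Q_OPT, point) >= saddle_value

print(saddle_value)
```

Because the Lagrangian of a linear program is linear in Q for fixed V and linear in V for fixed Q, the "hill" and the "valley" here are flat in some directions; the inequalities hold weakly rather than strictly.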


APPENDIX B: THE INITIAL BASIC SOLUTION 
AND THE "FEASIBILITY PROGRAM" 


We have seen that in many cases in which the origin is not a feasible 
solution for a program, rather than solving the original problem, one can 
instead solve its dual. This is possible wherever the origin is a feasible solu- 
tion for the dual. However, there can arise cases in which neither for the 
primal problem nor the dual program is the origin a feasible point. First we 
will see how such a case can arise. Then a method by which one can solve 
such a problem will be described. 

If we look again at the primal and dual matrices (4), we see that the
origin is a feasible solution for the primal if and only if all the $C_i \ge 0$. For,
then, if we set all $Q_j = 0$ we obtain the feasible solution $U_1 = C_1 \ge 0$,
$U_2 = C_2 \ge 0$, etc. Similarly, from the dual matrix in (4) we see that its
origin is a feasible solution for the dual if and only if all $P_j \le 0$, for then at
the origin, where all the dual structural variables, $V_i$, are equal to zero, we
have the feasible solution $L_1 = -P_1 \ge 0$, $L_2 = -P_2 \ge 0$, etc.

Hence, we can solve either the primal or the dual problem in the normal
manner unless there is at least one $C_i < 0$ and at least one $P_j > 0$, as the
reader can verify by writing out such a problem in algebraic form. There
will be no such difficulty if the maximization problem of the pair of dual
programs contains only the normal sort of capacity constraints in which
the use of resources must be less than or equal to their capacity. Similarly,
it does not arise in the usual sort of minimization problem in which all the
constraints are minimum acceptability conditions such as “the advertising
program must reach at least 80 million people aged 18-40.”

Where, however, constraints of both varieties enter a single problem,
difficulties may arise. For example, suppose the firm's objective is to
maximize its profits $\pi = 9Q_1 + 2Q_2$ from the quantities, $Q_1$ and $Q_2$, of its two
outputs, subject to a warehouse capacity constraint:


$3Q_1 + Q_2 \le 6$


and a contractual constraint specifying a minimum total output of the two
items that must be delivered:


$Q_1 + Q_2 \ge 3$.


[Figure 3a: Primal Program. Figure 3b: Dual Program. Feasible regions shaded.]

Figure 3a shows the (shaded) feasible region for this program and Figure 3b
shows the feasible region for its dual. It is obvious that the origin is not
feasible for either program.

Next, if we put the slack variables $U_1 \ge 0$, $U_2 \ge 0$ into the two con-
straints we obtain readily


          $U_1 = 6 - 3Q_1 - Q_2$
(1)
          $U_2 = -3 + Q_1 + Q_2$,

so that the simplex matrix for our program becomes


                 1     Q1    Q2
          π  =   0      9     2
(2)       U1 =   6     -3    -1
          U2 =  -3      1     1



Since this contains the negative number —3 in the first column, we see again 
that the origin is not a feasible solution for the primal problem. Similarly, 
since it contains the positive numbers 9 and 2 in the first row, the origin is 
not a feasible solution for the dual. Neither problem, then, can be solved by 
the usual procedure that starts from the origin as its initial feasible solution. 
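
The two feasibility tests just described reduce to sign checks on the first column and first row of the simplex matrix. A minimal sketch, using the numbers of this example (the variable names are chosen here for illustration), writes the contractual constraint in the "less than or equal" form $-Q_1 - Q_2 \le -3$ so that both capacities appear as $C_i$:

```python
# Data of the example: profits per unit and right-hand sides C_i once both
# constraints are written as  sum_j a_ij * Q_j <= C_i.
P = [9, 2]     # objective coefficients P_j
C = [6, -3]    # 3*Q1 + Q2 <= 6  and  -Q1 - Q2 <= -3

# The primal origin (all Q_j = 0) is feasible only if every slack U_i = C_i >= 0.
primal_origin_feasible = all(c >= 0 for c in C)

# The dual origin (all V_i = 0) is feasible only if every dual slack -P_j >= 0.
dual_origin_feasible = all(p <= 0 for p in P)

print(primal_origin_feasible, dual_origin_feasible)
```

Both checks fail for this program, which is the situation the remainder of the appendix addresses.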

If a problem taking this inconvenient form does possess at least one
feasible solution,⁸ there is an ingenious procedure invented by Michel
Balinski which enables us to deal with the programming calculation in a
fairly straightforward manner.⁹ What the procedure does is to take a
nonfeasible simplex matrix such as (2) and by a sequence of pivot steps 
transform it into a matrix giving one of the feasible basic solutions of the 
problem. Once this is done one can proceed to find the optimal solution of 
the problem by applying the normal simplex procedure to the feasible 
matrix found through the Balinski procedure. 


8 It is conceivable that it might not have any feasible solutions. This will occur if
the constraints impose inconsistent requirements. For example, a program obviously
will have no nonnegative solution if it contains the constraint


$Q_1 + 3Q_2 \le -2$


or if it has the pair of constraints $Q_1 \le 5$, $Q_1 \ge 12$. Of course, where the set of constraints
is large and involves many variables, inconsistencies are far harder to detect than they
are in trivial and transparent cases such as the preceding.

9 The discussion that follows is fairly specialized and the reader may prefer to
omit it.

The nature of the method is best explained by example:


Step 1: Rewrite the matrix moving all rows with negative first entries
(i.e., the infeasible portions of the solution) to the top of the new matrix.
Thus, since in matrix (2) only the last row has a negative entry, we move
that last row to the top of the matrix to obtain the new matrix


                 1     Q1    Q2
          U2 =  -3      1     1
(3)       π  =   0      9     2
          U1 =   6     -3    -1



This changeover is carried out just for convenience and obviously 
involves no substantive change in any of the relationships of the program. 

The objective now is to rewrite the problem, if possible, in a way that 
gets rid of all the negative entries in the first column, since it is these 
negative entries that are the source of the nonfeasibility in our initial 
solution. 


Step 2: We now treat the first row in the new matrix as a pseudo-
objective function and begin moving toward the maximum value of this
function by the usual pivoting steps of the simplex process. This again
produces no substantive change in the programming problem itself because,
it will be recalled from Chapter 5, a pivot step only rewrites the equations
and inequalities of the program expressing one subset of variables in terms
of the others. Just as the equation y = 3x + 6 is equivalent to x = y/3 - 2,
a linear program after a pivot step remains the same in substance and
changes only in form.

The purpose in moving toward a maximum of the value of the first
entry in the matrix (which started out negative) is to see whether there is
any pivoting transformation of the program which replaces that negative
entry by some positive number. In terms of matrix (3), we want to pivot in
a way that will replace the $-3 = U_2$ in the upper left-hand corner of
matrix (3) by some positive number, and the obvious way to go about it is
to move toward the maximal value¹⁰ of $U_2$. We do this, as usual, by
picking a pivot element from a column with a positive top entry, say the
last column in matrix (3). However, the choice of pivot row in the Balinski
process has two additional provisos: (a) Never take a pivot element from a
row corresponding to the objective function of the original problem [row 2
of matrix (3)],¹¹ and (b) never take a pivot element from a row with a nega-
tive first entry [there happen to be none other than row 1 in illustrative
matrix (3)].¹²


10 If it were, unfortunately, to turn out that the maximal value of $U_2$ were only,
say, -0.6, then obviously no positive replacement for the negative first entry would be
possible, and in such a case it would follow by definition that the problem in question
possessed no feasible basic solution (no basic solution with all entries in the first column
positive).

We now pivot in the usual way, on the pivot element -1, obtaining as
our new simplex matrix (as the reader can verify)


                 1     Q1    U1
          U2 =   3     -2    -1
(4)       π  =  12      3    -2
          Q2 =   6     -3    -1


Because of the simplicity of the illustrative problem, this one pivoting
step has yielded us a feasible solution because there are no longer any
negative entries in the first column. If any had remained, we would simply
have repeated our procedure, bringing another row with a negative entry
to the top of the matrix (step 1) and moving toward a maximum of this
new pseudo-objective function (step 2).¹³


11 We avoid this since such a pivot element would move $\pi$, the objective variable,
over to the top of the matrix, i.e., to the right-hand side of the equations or inequalities
of the program, and it is never appropriate to do that since we always want to express
$\pi$ in terms of the other variables, not the other variables in terms of $\pi$.

12 We never take an entry from a nonfeasible row because the pivot row is chosen
so as to make certain that the pivot step does not transform any initially positive first
entry into one which becomes negative after the next pivot step (see footnote 19 of
Chapter 5). That is, the pivot row is selected in a way which prevents the introduction
of a nonfeasible variable value into the solution by a badly chosen pivot step. But to
prevent this we need only consider those variables whose values were nonnegative
before the pivot step in question, i.e., those rows whose first entries were nonnegative
to begin with.

13 It may seem at first blush as though this further pivoting step might reintroduce
negative entries into the first column in places where they had previously been elim-
inated. That is, suppose in (4) the last entry had indicated $Q_2 = -6$. Might not the
process of getting rid of that minus sign reintroduce a negative value for $U_2$ which we
have just gotten rid of? It is not difficult to show that such a mishap can never occur if
the pivoting process is carried out correctly, for it is proved in footnote 19 of Chapter 5
that if the pivot element is chosen by the rules of Section 9 no pivot step will ever
replace a positive entry in the first column by a negative one.

In our case, since (4) is now a feasible solution we now move on to


Step 3: Move the row corresponding to the true objective function
[the second row in (4)] back to the top of the matrix, obtaining


                 1     Q1    U1
          π  =  12      3    -2
(5)       U2 =   3     -2    -1
          Q2 =   6     -3    -1


Step 4: Now, if the solution is not optimal, proceed with the usual 
simplex calculation starting from the feasible simplex matrix that the 
Balinski procedure has provided. In our example, (5) is obviously not an 
optimal solution since the second entry in the top row is still positive. We 
therefore select a pivot element from that column and continue on with the 
simplex calculation until the optimum is found. 

The procedure is really quite straightforward and appears to be rather 
efficient computationally. To summarize, what the Balinski procedure does 
is to move us from a basic solution which is not feasible (the origin) to 
some other basic solution that is feasible so that we can apply the usual 
simplex procedure to its matrix. It does this by taking the variables with 
initially nonfeasible values (those whose values are initially negative) and 
substituting for the initial programming problem an equivalent problem 
whose (pseudo-) objective is to maximize (in turn) the values of those 
variables whose initial values are negative. It does this by moving the 
corresponding rows to the top of the matrix and proceeding by the usual 
pivoting steps to increase the values of the offending variables until all 
negative values have been eliminated. 
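
The four steps can be traced in code for the illustrative problem. The sketch below is not a general implementation of the Balinski procedure: the step 2 pivot is hard-coded to the choice made in the text, and the helper names (`pivot`, `simplex`) are hypothetical. Each row encodes an equation of the form "row variable = constant + coefficients times the column variables," as in matrices (2) through (5).

```python
from fractions import Fraction

def pivot(rows, cols, T, r, c):
    """Exchange the row variable rows[r] with the column variable cols[c]."""
    p = T[r][c]
    solved = [-x / p for x in T[r]]   # solve row r for the column variable
    solved[c] = 1 / p                 # coefficient of the old row variable
    new_T = []
    for i, row in enumerate(T):
        if i == r:
            new_T.append(solved)
            continue
        out = [row[k] + row[c] * solved[k] for k in range(len(row))]
        out[c] = row[c] * solved[c]
        new_T.append(out)
    new_rows, new_cols = list(rows), list(cols)
    new_rows[r], new_cols[c] = cols[c], rows[r]
    return new_rows, new_cols, new_T

def simplex(rows, cols, T):
    """Ordinary simplex steps: maximize row 0 from a feasible matrix."""
    while True:
        c = next((k for k in range(1, len(cols)) if T[0][k] > 0), None)
        if c is None:
            return rows, cols, T
        candidates = [i for i in range(1, len(rows)) if T[i][c] < 0]
        if not candidates:
            raise ValueError("objective is unbounded")
        r = min(candidates, key=lambda i: T[i][0] / -T[i][c])
        rows, cols, T = pivot(rows, cols, T, r, c)

# Step 1: matrix (3), with the infeasible row U2 moved to the top.
rows3, cols3 = ["U2", "pi", "U1"], ["1", "Q1", "Q2"]
T3 = [[Fraction(v) for v in row]
      for row in [[-3, 1, 1], [0, 9, 2], [6, -3, -1]]]

# Step 2: pivot on the element -1 (row U1, column Q2), giving matrix (4).
rows4, cols4, T4 = pivot(rows3, cols3, T3, r=2, c=2)
assert all(row[0] >= 0 for row in T4)   # first column is now nonnegative

# Step 3: move the true objective row back to the top, giving matrix (5).
order = [rows4.index("pi")] + [i for i, n in enumerate(rows4) if n != "pi"]
rows5, T5 = [rows4[i] for i in order], [T4[i] for i in order]

# Step 4: continue with the usual simplex calculation.
rows, cols, T = simplex(rows5, cols4, T5)
solution = {name: row[0] for name, row in zip(rows, T)}
print(solution)
```

Under these assumptions the calculation ends with both outputs equal to 3/2, the point at which the warehouse constraint and the contractual constraint both hold with equality.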


PROBLEM 


Verify from (2) and from the dual of (2) that points A and B in Figures 3a and 
3b are in fact the respective optimal solutions of the primal and the dual.


Chapter 7: Nonlinear Programming*


In economic terms nonlinear programming may be described as 
the analysis of constrained maximization problems in which diminishing 
or increasing returns to scale are present. For example, by doubling all its 
inputs a firm may find that it can increase its profits by only 47 per cent. 
This may occur because, for some reason, its physical outputs cannot 
keep pace, or because it becomes increasingly difficult to sell additional 
outputs so that selling costs yield diminishing returns and/or the prices of 
its products fall. 


1. Algebraic Notation and Example 


In a nonlinear program the algebraic expressions which occur in either 
the objective function (e.g., the profit or cost relationship) or in the con- 
straints or both will involve nonlinear terms such as X? or 5? or cos X. 
Thus, rather than representing the profit relationship by a simple linear
expression such as 5X + 3Y + 7Z, we use the more general functional
notation total profit = f(X, Y, Z), which states simply that profit is
dependent in some way on the quantities of the three outputs, X, Y, and
Z. Similar notation is used for the constraints, so that the general nonlinear
programming problem may be written in the usual three parts:
1. Objective function
maximize (or minimize) f(X, Y, Z, ...)
subject to


2. Constraints


and
3. The nonnegativity requirements
$X \ge 0$, $Y \ge 0$, $Z \ge 0$, ....


* Some of the material in this chapter may be considered conceptually difficult.
However, it is recommended that readers of the two preceding chapters read at least
Sections 1 through 6, for only by contrast with nonlinear programming can the limita-
tions and peculiarities of the linear programming case be fully understood.

To show how such a nonlinear programming problem can arise in an
economic problem, consider the case where the unit profits $P_x$ and $P_y$ of
X and Y are fixed, say $P_x = 5$ and $P_y = 3$. Suppose, however, that because
of customer resistance to increased purchasing (a negatively sloping
demand curve) the unit profits of Z, $P_z$, fall continuously as more of this
commodity is offered for sale, in accord with the simple linear relationship


$P_z = 200 - 0.005Z$,


which states that every time another thousand units of Z are offered for
sale, its price falls 5 cents. Substituting this expression for $P_z$ into the
objective function, that function becomes
total profit $= P_x X + P_y Y + P_z Z = 5X + 3Y + (200 - 0.005Z)Z$
             $= 5X + 3Y + 200Z - 0.005Z^2$.
Note that this contains a $Z^2$, so it is no longer a linear relationship
despite the fact that the demand expression which was substituted for $P_z$
is linear.
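
A short calculation makes the nonlinearity concrete. The sketch below evaluates the profit expression derived above; the output quantities tried are hypothetical and chosen only to show that doubling Z does not double profit.

```python
def total_profit(x, y, z):
    """Total profit 5X + 3Y + (200 - 0.005Z)Z from the example above."""
    unit_profit_z = 200 - 0.005 * z   # unit profit of Z falls as Z rises
    return 5 * x + 3 * y + unit_profit_z * z

# Hypothetical output levels: only Z is produced, first 10,000 then 20,000 units.
base = total_profit(0, 0, 10_000)
doubled = total_profit(0, 0, 20_000)

print(base, doubled, doubled / base)
```

With a linear profit function the final ratio would be exactly 2; here it is smaller because the $Z^2$ term grows faster than the linear term.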


2. Geometric Representation: Nonlinear Constraints 


It is convenient to consider the effects of nonlinearities on the graph 
of a programming problem in two separate stages: (1) the effects of non- 
linearities in the constraints, and (2) the effects of nonlinearities in the 
objective function.

Let us first examine the constraints. Suppose the problem involves,
for example, the inequality

$X^2 + Y^2 \le 1$.

As in the linear programming case, this divides combinations of X and Y
into two classes: feasible (those which meet the conditions specified by the
inequality) and infeasible (those combinations of X and Y which violate
the inequality). The border line between the region of feasible points and
those which represent outputs that do not satisfy the inequality is again
given by the requirement that equality hold in the constraint:


$X^2 + Y^2 = 1$, i.e., $Y = \sqrt{1 - X^2}$.


[Figure 1: panels (a) and (b)]


Any point on the graph of this equation represents a combination of X
and Y which just manages to squeeze in “under the wire”; it just barely
satisfies the inequality. By trying various values of X and computing the
corresponding values of Y, or by more sophisticated means, it can be seen
that the graph of this relationship is that shown in Figure 1a. Furthermore,
if we require X and Y to be nonnegative, the feasible output combinations 
are represented by the shaded region in this diagram. A second nonlinear 
inequality might reduce the feasible region to that depicted in Figure 1b. 

We see, then, that nonlinear inequalities trace out a feasible region 
just as do linear inequalities. However, when they are nonlinear the 
borders of the feasible region will consist, at least partly, of curved lines. 


3. Geometry of Nonlinear Objective Functions 


For reasons analogous to those just described, the graph of a nonlinear 
objective function is not a plane as in the linear case. Instead profit
functions may be hills or valleys or of totally irregular shape.

Several such profit relationships and the corresponding iso-profit curves
(profit indifference curves) are illustrated in Figures 2, 3, and 4. Figure 2a
represents the "best-behaved" type of profit function. For reasons which
are discussed later in this chapter, such a function makes life easier for
the programmer than does the presence of other types of nonlinear profit
functions.

This can be described as a diminishing-returns case: the curvature of
the surface (its upside-down U-shaped cross sections) indicates that
increases in output yield diminishing marginal returns (see Section 4
of Chapter 11). Indeed, increases in output beyond the profit-maximizing
point M must yield diminishing total returns, i.e., such an increase in
output must obviously reduce total profits.

[Figure 3: panels (a) and (b)]

The iso-profit curves of nonlinear profit functions are not usually parallel 
straight lines. In this hill-shaped profit function case they are closed curves 
which lie inside one another. It is also to be noted that, unlike the linear 
case, the direction of profitable movement can change. For example, in 
Figure 2b a movement upward and to the right (an increase in both out- 
puts) will sometimes increase profits and sometimes reduce them. Thus 
the move from A to B adds to total profits, but a move from T to C reduces 
them. 

An extreme case of this phenomenon is depicted in Figure 3. Here the
iso-profit curves have the traditional shape of indifference curves, but
there is one difference. Profits increase as we move up from H to J to K.
But curve K corresponds to the maximum profit ridge MM’ in Figure 3a,
so that if we move further away from the origin, say from an output
combination on K to one on L, profits actually fall.

Finally, Figure 4 depicts an important case, that of increasing returns to
specialization, where higher values of, say, X bring in ever-increasing
marginal yields (the upward curvature of OCT"). After a preliminary
discussion of some geometric concepts we shall return to examine some of
the problems which such a situation involves.

[Figure 4: vertical axis, total profit]


4. Convex and Nonconvex Regions 


The shaded region in Figure 5a is of a variety which is called convex. 
That is to say, it has no dents or holes as does the nonconvex region in 
Figure 5b. More precisely, a region is defined to be convex if the straight 
line connecting any two of its points lies entirely inside the region. Thus, 
if we draw the line connecting any two points, such as M and R in Figure 
5a, that line will never leave the shaded area. But if in 5b we try to connect 
points A and B or B and C in this way, our lines will have to traverse 
some of the unshaded part of the diagram. 
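
The straight-line definition of convexity lends itself to a direct check. The sketch below samples points along the segment joining two points and tests whether each stays in the region. The first region is the constraint $X^2 + Y^2 \le 1$ with nonnegative outputs from Section 2; the second region and the function names are hypothetical, introduced only to show a failure. Sampling a single segment can refute convexity but cannot prove it.

```python
def in_quarter_disc(x, y):
    """Region from Section 2: X, Y >= 0 and X^2 + Y^2 <= 1."""
    return x >= 0 and y >= 0 and x * x + y * y <= 1

def outside_disc(x, y):
    """Hypothetical dented region: X, Y >= 0 and X^2 + Y^2 >= 1."""
    return x >= 0 and y >= 0 and x * x + y * y >= 1

def segment_stays_inside(region, a, b, steps=100):
    """Test sampled points on the straight line from point a to point b."""
    for k in range(steps + 1):
        t = k / steps
        x = (1 - t) * a[0] + t * b[0]
        y = (1 - t) * a[1] + t * b[1]
        if not region(x, y):
            return False
    return True

a, b = (1.0, 0.0), (0.0, 1.0)   # both points belong to both regions
print(segment_stays_inside(in_quarter_disc, a, b))
print(segment_stays_inside(outside_disc, a, b))
```

The segment from a to b cuts across the dent of the second region, much as the lines joining A and B or B and C leave the shaded area in Figure 5b.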

The distinction between convexity and nonconvexity of the feasible
region is important for programming. Nonconvexity can make it more
difficult for us to find the optimal point, and it can also make it more
difficult for us to recognize the point to be optimal when we get there.

1. Testing for optimality. Suppose, for the moment, that the objective
(profit) function is linear so that the iso-profit curves are parallel straight
lines. Then, in a feasible region which is convex, if we find a point such as
M from which any small move decreases profit, we can be sure that this
point is a global optimum, i.e., no move of any magnitude will bring in
profits higher than those at M. M is a point of “tangency” between iso-profit
curve PP’ and the boundary of the feasible region.

Because of the convexity of the region, its boundary curves further
and further away from iso-profit line PP’. Thus if a small move away from
M, say to N, reduces profits, we can be sure that any further move in
that direction, say to Q, will reduce profits still more.

[Figure 5: panels (a) and (b)]

However, this result does not hold for the nonconvex feasible region 
in Figure 5b. There, a move from point of tangency A over to D does 
indeed reduce profits. But if we are patient and nevertheless follow along 
the boundary of the feasible region, it may begin to curl back upward 
again (point E) and eventually we may even reach a point B which is 
also a point of price line-feasible region tangency and which yields profits 
far higher than those at tangency point A. A point like A which yields 
higher profit than any other feasible point in its vicinity is called a local 
optimum, whereas the point B which really yields maximum profits is a 
global optimum. 

2. Finding the optimum. A related problem produced by nonconvexity
is that it is more difficult under these circumstances to design an iterative
procedure to find the global optimum. In the convex case (Figure 5a) any
move which increases profits (e.g., the move from Q to N) is certain to get 
us closer to the optimal point, M, because there is only one profit hilltop 
and any uphill move must bring us nearer to it. But in the case of the 
nonconvex region a move which increases profits (e.g., the move from E 
to D in Figure 5b) can move us toward the wrong hilltop, A, and away 
from the true optimum, B. Hence in such a case an iterative procedure 
which is designed always to increase profits may very well fail to lead us 
to the global optimum. 
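
The failure of an always-uphill procedure can be reproduced in a few lines. In the sketch below the list `profits` is a hypothetical profit profile along the boundary of a nonconvex feasible region, with a lower hilltop at position 2 and the true optimum at position 7; `myopic_climb` is an illustrative name for a procedure that only ever looks at adjacent points.

```python
def myopic_climb(profits, start):
    """Move to the better adjacent point until no neighbour improves profit."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(profits)]
        best = max(neighbours, key=lambda j: profits[j])
        if profits[best] <= profits[i]:
            return i          # no small move raises profit: a local optimum
        i = best

# Hypothetical profits at successive boundary points (two hilltops).
profits = [1, 3, 5, 4, 2, 3, 6, 9, 7]

global_best = max(range(len(profits)), key=lambda j: profits[j])
print(myopic_climb(profits, 0), myopic_climb(profits, 4),
      myopic_climb(profits, 5), global_best)
```

Every step taken from position 4 increases profit, yet the climb ends on the lower hilltop; only a start on the far side of the valley reaches the global optimum.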

What is the significance of these results? It is, in effect, that near- 
sighted mathematicians cannot be trusted to solve programming problems 
in which the feasible region is nonconvex. Any procedure that tells us to
test for maximum profits by checking whether a few steps in any direction
reduce profits can be considered to involve such a myopic approach.
An example of such a procedure is the simple requirement of the differential 
calculus that the second derivative of the function whose value is to be 
maximized be negative at the maximum point. For this condition merely 
states that any move to a point very near the maximum point results in a 
reduction in the value of the objective function, so it is a satisfactory 
condition only where nearsightedness is no handicap. In other words there 
are optimality tests and resulting economies of calculation which are in- 
applicable to problems involving nonconvex feasible regions. 

It is important to note that in a linear program the feasible region is 
always convex.¹ That is why, in the simplex method, to see whether some 
corner C of the feasible region is optimal it is only necessary to test the 
profitability of a move to one of the corners adjacent to C. For if a move 
to any such corner reduces profits (or increases costs) then any further 
moves must certainly be disadvantageous. In other words, the simplex 
method can be classified as a nearsighted calculation method. 


1 To demonstrate this it is only necessary to show that if S (Figure 5a) is any point
on the line connecting two feasible points such as M and R in the feasible region of a
linear program, then S is also feasible (it lies in the shaded region). But S represents an
output combination which can be obtained by a suitable scaling down of the activities
at M and R. For example, if S is the midpoint of line MR, it represents the sum of half
the outputs at M and half the outputs at R ($X_S$ is the midpoint of $X_M$ and $X_R$, etc.).
Now consider any scarce facility k, whose total capacity is K. Since M is feasible, its
production must require no more than K units of this facility, and the same must hold
for output combination R. But because a linear program involves constant returns to
scale, it follows that half of the outputs at M or at R require no more than ½K units of
facility k in their production. Hence S will require in its manufacture no more than
½K + ½K = K units of k, i.e., no more of this scarce resource than is available. A sim-
ilar argument holds for every scarce resource needed to produce S, and it therefore
follows that output S must be feasible.

5. Concave and Convex (Objective) Functions


A classification somewhat analogous to that just described for geo-
metric regions also holds for the graphs of functions. A function whose
graph is like that in Figure 4 is called convex while that in Figure 2a is
called concave. More precisely, a function is called strictly concave if, when
we draw a straight line connecting any two points A and B on its graph
(Figure 6a), the whole of the arc AB, excluding the end points, lies above
the straight line AB. The function is called concave (but not strictly
concave) if its graph contains some linear stretches so that a straight line
such as A’B’ (Figure 6b) may coincide with arc A’B’.

[Figure 6: panels (a) and (b); vertical axis f(x)]

Convex and strictly convex functions are defined analogously, only here
the connecting arc always lies below or coincides with the line connecting
two points on the graph. It should be noted that a linear function, whose
graph is a straight line, a plane, or a hyperplane (n-dimensional analogue
of a plane), is always both concave and convex.
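
The chord definition can be tested numerically for functions of one variable. The sketch below checks only the midpoint of each chord over a finite grid, so it can expose a violation but cannot prove concavity in general; the three test functions and the helper names are hypothetical illustrations.

```python
import itertools

def is_concave_on_grid(f, xs, tol=1e-12):
    """Check that the graph at each chord midpoint lies on or above the chord."""
    for a, b in itertools.combinations(xs, 2):
        midpoint = (a + b) / 2
        chord_height = (f(a) + f(b)) / 2
        if f(midpoint) < chord_height - tol:
            return False
    return True

def is_convex_on_grid(f, xs, tol=1e-12):
    """A function is convex exactly when its negative is concave."""
    return is_concave_on_grid(lambda x: -f(x), xs, tol)

xs = list(range(11))                 # grid 0, 1, ..., 10

hill = lambda x: -(x - 5) ** 2       # hill-shaped: concave
valley = lambda x: x ** 2            # valley-shaped: convex
line = lambda x: 3 * x + 1           # linear: both concave and convex

for name, f in [("hill", hill), ("valley", valley), ("line", line)]:
    print(name, is_concave_on_grid(f, xs), is_convex_on_grid(f, xs))
```

The linear function passes both tests because its chords coincide with its graph, which is the non-strict case described above.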

Objective functions of the "wrong" shape can lead to the same problem 
as nonconvex feasible regions—they can produce local optima which are 
not global optima so they can invalidate myopic computational procedures 
and procedures for testing for optimality. But in this case, which shape is 
wrong depends on the nature of the problem. As we shall see now, a well-
behaved objective function in a maximization problem is apt to be badly
behaved in a minimization problem, and vice versa.

A moment’s thought indicates why this is so. In a maximization prob-
lem, if the function is hill-shaped (concave) all uphill roads lead to the top.
In Figure 2a we can confidently proceed by moving upward in any direction
because all uphill paths end up at the peak. Hence any trial-and-error
(iterative) procedure that keeps trying successive output levels which
are more profitable than those in the previous attempts will (if it does
not move up too slowly) eventually get us to the maximum profit output 
combination. Moreover, if we are at a point M from which any small 
move takes us downhill, we know that this point must be the true optimum. 
The myopic computational techniques and optimality tests both work. 

But if in a maximization problem the graph of the objective function 
contains a valley (it is convex), going in an uphill direction is not guaran- 
teed to get us to the top. If we start uphill from point A in Figure 4, we 
may land up in point B instead of point C, the global optimum in the 
shaded feasible region. 

In a minimization problem it is easy to see that the situation is re- 
versed—a valley is desirable and a hill is troublesome from the point of 
view of computation. 


We may sum up by stating that nearsighted computational techniques 
can be used in a maximization problem if the feasible region is convex and 
the objective function is concave. In a minimization problem such methods 
can be employed if the objective function and the feasible region are both 
convex.


There is another way in which the economist can look at this matter. 
A graph of a concave function such as Figure 2, if it is a profit function, 
represents a situation involving diminishing returns, as already noted. But 
if the figure were to represent a cost function, it should be clear that it 
would be one involving increasing returns (decreasing marginal costs as 
output expands). Similarly, convex Figure 4 represents either an increasing- 
returns profit function or a diminishing-returns cost function. Thus a 
well-behaved objective function in either the maximization case (concave) 
or the minimization case (convex) involves diminishing returns. We can, 
then, restate our main theorem as: 


Myopic computational and optimality testing techniques can be used 
when the problem involves a convex feasible region and diminishing 
returns. 


The role of diminishing returns in this proposition is easily visualized 
intuitively—if any departure from a local optimum always involves di- 
minishing returns, then going further and further only makes things worse, 
so that our local optimum must be a global optimum as well. Indeed, this 
reasoning can be carried further to indicate that diminishing returns will 
tend to produce convexity in the feasible region. The stretch EF in the 
nonconvex feasible region in Figure 5b clearly involves increasing returns 
to output (increasing marginal profits as output increases). If diminishing 
returns held throughout, the boundary of the feasible region would curve 
further and further away from the iso-profit line through local optimum 
point A so that no other optimum point such as B would be possible. 
Therefore, only if we have diminishing returns throughout the feasible region 
are myopic computational techniques generally legitimate. This is the final, 
most compact form of our theorem. 


6. Nonlinearities and the Basic Theorem of Linear Programming 


It will be recalled from the preceding chapter that there is a central 
theorem of linear programming which states that the number of variables 
(including slack variables) whose values are positive in an optimal solution 
will ordinarily be equal to the number of constraints in the problem. In 
other words there will always be a basic optimal solution to any solvable 
linear programming problem. In geometric terms, there will then always 
be an optimal solution which occurs at a corner of the feasible region. This 
is one of the great computational economies which linear programming 
makes possible. An optimal solution can always be found by examining 
only the corners of the feasible region, and ignoring the infinite set of points 
which make up the remainder of the feasible region. 

In nonlinear programming this result does not hold. This is easily 
shown by counterexample. In Figure 2b the optimum point is the point of 
tangency T between the boundary of the feasible region and the highest 
attainable iso-profit curve. It will be noted that at point T both X and Y 
have positive values even though there is only one constraint. 

In Figure 3b an even more extreme case is depicted. Here optimal 


2 More specifically, it is easy to show that, if the variables represent the magnitudes 
of several outputs, then convexity of the feasible region is tantamount to diminishing 
marginal rate of transformation of one output for another (see Value and Capital, 2nd 
edition, Oxford University Press, Inc., New York, 1946, pp. 80-87), for the outer (north- 
east) boundary of the feasible region (arc TN in Figure 5a) is the production possibility 
locus (the transformation curve or efficiency frontier), which represents all the maximal 
output combinations producible with the available quantities of scarce resources (see 
Chapter 11, Section 7, below). Convexity requires that as we move to the right along this 
arc its slope diminishes (becomes increasingly negative) or, for some stretches, remains 
unchanged. But the diminishing slope of TN means that we get diminishing returns in 
shifting resources out of the production of Y and into the production of X. Thus, an 
increase in the output of X from Xu to Xm results in only a small decrease in Y (from 
the ordinate of U to that of M), but a further equal increase in X from Xm to Xn requires 
a much larger fall in Y. This argument can be extended to the case where there are 
more variables, or variables other than outputs, to show a general connection between 
diminishing returns and convexity of the feasible region. 

Note that, in this respect, diminishing returns is compatible with a linear program- 
ming problem, because the feasible region of such a problem is always convex. Cf. foot- 
note 1, above. 

points, such as V and W, occur in the interior of the feasible region. It 
is to be noted, then, that even though there is only one constraint both 
X and Y are positive. But, in addition, since the facility represented by 
that constraint is not used to capacity (we are not on the constraint line), 
the slack variable corresponding to that constraint will also have a nonzero 
value. Here, then, we have only one constraint and yet every one of the 
three variables' values is positive! 

It is clear, then, that the so-called basic theorem of linear programming 
need not hold in the presence of nonlinearities. However, the connection 
between the number of variables whose optimal values are nonzero and 
the number of constraints does not just descend into chaos. A very important 
relationship exists between the difference in these two numbers 
and the structure of the problem. In general we may state that with 
diminishing returns (a concave maximization or a convex minimization 
problem) the number of positive variable values will tend to be greater 
than the number of constraints. In the increasing-returns case the number 
of positive optimal variable values will generally fall short of the number of 
constraints. It follows that if we try to approximate a nonlinear problem 
with a linear programming calculation, then if there are diminishing re- 
turns we should suspect that the answer will contain too few positive 
values, while if it involves increasing returns it will contain too many. 

The reason for this result is not difficult to see. First we may note how 
this follows from the geometry. We consider only the maximization prob- 
lem, but the argument for the minimization case is perfectly analogous. 
In Figure 2, the diminishing-returns case, the highest feasible point, that 
is, the optimal point, will tend to occur toward the center of the diagram 
where variables take nonzero values. On the other hand, in the increasing- 
returns case (Figure 4) where the graph curls upward toward the edges of 
the diagram, maximal points like B and C will occur over the axes where 
variables take on zero values (X is zero at point B and Y is zero at point C). 

But there is an easier way to visualize the relationship between dimin- 
ishing or increasing returns and the number of nonzero variable values in 
an optimal solution. When a linear program indicates that a firm with 
2,000 products and 17 constraints should cut its line down to 17 or fewer 
items, it is reasoned implicitly that the combination of products which 
yields the greatest profits with a small expansion in their outputs will also 


3 It is true that this figure also contains two optimal points Q and R which lie on the 
boundary (but not on corners) of the feasible region. But it is easy to see that a slight 
modification of the drawing could eliminate these by bending down the three-dimensional 
surface in Figure 3a as it approaches the axes. A simpler example of an optimum point 
which occurs only in the interior of the feasible region would result if the highest point 
M in Figure 2 fell inside the feasible region. 

continue to be most profitable as their production increases indefinitely. 
It will then pay to enlarge the output of these most profitable goods as 
far as possible—until they take over all of the firm's scarce facilities and 
leave no excess capacity sufficient for the production of other items. But 
if there are diminishing returns to the production of these goods, then 
though they start off being the most profitable, after their output expands 
to some intermediate level the profitability of a further increment in their 
outputs will fall below that of some other goods and it will then pay to 
devote some of the company’s resources to the output of these other goods, 
and so on. 
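
The argument can be illustrated by allocating one scarce facility an hour at a time, always to the product whose next unit is currently the most profitable. This is only a sketch under assumed figures: three products, each requiring one machine hour per unit, 60 available hours, and marginal profits that either stay constant (the linear case) or fall by one for every unit already produced (diminishing returns). The name `allocate` and all of the numbers are hypothetical.

```python
def allocate(marginal_profit, n_products, capacity):
    """Give each hour of capacity to the product with the highest current marginal profit."""
    output = [0] * n_products
    for _ in range(capacity):
        best = max(range(n_products), key=lambda i: marginal_profit(i, output[i]))
        if marginal_profit(best, output[best]) <= 0:
            break                      # no product is worth expanding any further
        output[best] += 1
    return output


BASE = [50, 40, 30]                    # hypothetical profit on the first unit of each product


def linear(i, q):
    """Constant marginal profit: the linear programming assumption."""
    return BASE[i]


def diminishing(i, q):
    """Marginal profit falls by 1 for each unit already produced."""
    return BASE[i] - q


linear_plan = allocate(linear, 3, 60)
diminishing_plan = allocate(diminishing, 3, 60)
print(linear_plan, diminishing_plan)
```

With one constraint the linear calculation keeps a single product, whereas under diminishing returns all three products are produced and their marginal profits are brought (approximately) into equality.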

That is precisely why our instincts are outraged by the linear program- 
ming recommendation that a 2,000-product firm cut its product line down 
to 17 items, devoting all of its facilities to the production of these goods. 
We surmise that the firm has spread its production over so many different 
items because the market calls for them—because it will be difficult or im- 
possible to market as much of the 17 items as the firm has the capacity to 
produce. There are diminishing returns because of increasing marketing 
costs. Often careful analysis will indicate that the 2,000-item line is in- 
deed excessive, but that the optimal set of products includes considerably 
more than the 17 items which will yield most profits to a small output 
expansion. 

For similar reasons, increasing returns tend to call for considerably 
greater concentration in a few products than does linear programming. If 
an item which is most profitable becomes even more profitable as its out- 
put expands, it will pay to drop other goods from the line to achieve the 
full benefits of specialization. 

Here again, an economic example may help to clarify the situation. 
In a study of the optimal number and geographic location of a company's 
warehouses it was found that a linear programming computation might 
very well suggest that the firm operate a separate warehouse for every 
customer! For, in the linear case, if a change in warehouse location would 
reduce the transportation cost of shipping from factory to warehouse to 
some single customer, it would (if these were the only costs involved) pay 
to operate such a warehouse. What this computation ignores is the econo- 
mies in inventory, administration, and bookkeeping, etc., which result from 
the operation of a small number of larger warehouses. In other words the 
increasing returns to size of warehouse operation are ignored in the linear 
programming calculation, which then recommends the operation of too 
many warehouse installations (too many nonzero variables). 
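
A toy version of the warehouse problem shows the effect. The sketch below is not the study referred to above; it assumes six customers located along a line, transportation cost equal to distance, candidate warehouse sites at the customers' own locations, and a fixed cost per warehouse standing in for the economies of inventory and administration. The function `best_sites` and every figure are hypothetical.

```python
from itertools import combinations


def best_sites(customers, fixed_cost):
    """Try every set of warehouse sites and return (lowest total cost, sites chosen)."""
    best = None
    for k in range(1, len(customers) + 1):
        for sites in combinations(customers, k):
            # each customer is served from the nearest warehouse
            transport = sum(min(abs(c - s) for s in sites) for c in customers)
            total = fixed_cost * k + transport
            if best is None or total < best[0]:
                best = (total, sites)
    return best


CUSTOMERS = [0, 1, 2, 10, 11, 12]      # hypothetical customer locations on a line

# Transport costs only (the linear view): one warehouse per customer.
linear_view = best_sites(CUSTOMERS, fixed_cost=0)

# With a fixed cost per warehouse (increasing returns to warehouse size).
with_economies = best_sites(CUSTOMERS, fixed_cost=5)

print(linear_view, with_economies)
```

When only transportation costs are counted, the cheapest plan places a warehouse at every customer. Once each warehouse carries a fixed cost of 5, the cheapest plan uses only two warehouses, one in the middle of each cluster of customers.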

In sum, when a linear programming calculation is employed to deal 
with a problem which really involves diminishing returns, the linear ap- 
proximation will recommend too few activities, whereas if the actual 
problem involves economies of large scale (increasing returns) the linear 
approximation will call for too many activities. This rule should be helpful 
in making rough improvements in the results of linear approximations. 
Even more important, it warns us of the dangers of a linear calculation where 
there are nonlinearities present. 


7. Methods of Nonlinear Computation 


This section makes no attempt to teach the reader step-by-step pro- 
cedures for nonlinear computation. The literature contains many special 
tricks which vary from one type of problem to another, and the field is in a 
state of rapid change and development. We shall discuss only two general 
approaches indicating the logic of the procedures which have been em- 
ployed. It will be noted that the methods which are about to be described 
are both of the myopic variety. They can only be relied upon to find a 
global optimum in a diminishing-returns problem. In a case of increasing 
returns these methods cannot be relied upon to produce more than local 
optima. There exist principles for the determination of global optima in 
the increasing-returns case, but they have as yet not been tested to any 
significant extent. 

Dr. Wolfe has described the simplex procedure of linear programming 
as a “walking” method.⁴ That is, the computer steps from one corner to 
another adjacent corner (always in the right direction), until he finds an 
optimum. By contrast he has described the techniques of nonlinear pro- 
gramming as hopping and creeping methods. Let us deal first with the 
former. 

The hopping approach applies only where the constraints are linear, 
even though the objective function is nonlinear. In fact most nonlinear 
programs which have so far been utilized in practice have been of 
this variety. In such a case the boundary of the feasible region is composed 
of a set of joined straight-line segments and is always convex. Indeed, it is 
identical with the feasible region of linear programming. 

In this situation it is nevertheless possible for a unique optimal point 
to occur away from a corner, as witness the case of point T in Figure 2b, 
which is an example of this sort of program. The hopping feature of the 
computational procedure is necessary to get us away from the corner-to- 
corner movement of the simplex method. In effect, starting off at any 
point we walk to the next more profitable corner and then hop back to 
any still more lucrative intermediate point. 

To be more specific, let us describe one of the more frequently used 
hopping procedures in somewhat greater detail. Consider the situation in 
Figure 7. Suppose at some stage of our computation we have just passed 


4 See Philip Wolfe, “Computational Techniques for Non-Linear Programs,” Prince- 
ton University Conference, March 1957. 


Figure 7 


corner K and happen to find ourselves at a point A. We then proceed in 
the following stages: 


Step 1: Fit a plane, PP′P″P‴, tangent to the nonlinear profit surface 
above point A.⁵ 


Step 2: Use one "round" of the simplex method on the resulting ap- 
proximative linear programming problem to find a corner, say B, which is 
more profitable than the previous corner K. 


Step 3: Use the differential calculus or some substitute procedure to 
find the most profitable point, say C, on the straight line connecting A 
and B, i.e., find the point C on line AB above which the nonlinear profit 
function is highest. (This is always a “one-dimensional problem,” i.e., the 
graph of alternatives is a straight line, and the computation is therefore 
not difficult.) 


Step 4: Go back to step 1, but start this time from point C and corner B 
(thus going, e.g., successively to corner D and then back to E, etc.). 


These “hopping” methods are slower than the walking methods. More- 
over, unlike the walking methods, they do not actually find the solution 
to a programming problem, but they do approximate it to any desired 
degree of accuracy, which is enough for all practical applications. 
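
The four steps can be sketched in Python. The sketch rests on several assumptions that are not in the text: the profit function 4X + 4Y − X² − Y², the triangular feasible region with corners (0, 0), (3, 0), and (0, 3), and the names `hop` and `line_search` are all made up for illustration. Step 2 here simply picks the most profitable corner of the linearized problem by enumeration instead of performing a round of the simplex method, and step 3 uses a ternary search as the "substitute procedure" for the calculus. In this form the procedure closely resembles what is now usually called the Frank-Wolfe (conditional gradient) method.

```python
def profit(x, y):
    """Hypothetical concave profit function; its unconstrained peak (2, 2) is infeasible."""
    return 4 * x + 4 * y - x * x - y * y


def gradient(x, y):
    """Partial derivatives of profit, i.e. the coefficients a and b of the tangent plane."""
    return 4 - 2 * x, 4 - 2 * y


CORNERS = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]   # feasible region: X >= 0, Y >= 0, X + Y <= 3


def line_search(f, a, b, rounds=60):
    """Step 3: ternary search for the best point on segment a-b (f is concave along it)."""
    def point(t):
        return a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])

    lo, hi = 0.0, 1.0
    for _ in range(rounds):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(*point(m1)) < f(*point(m2)):
            lo = m1
        else:
            hi = m2
    return point((lo + hi) / 2)


def hop(start, iterations=2000):
    a = start
    for _ in range(iterations):                    # Step 4: repeat from the new point
        ga, gb = gradient(*a)                      # Step 1: tangent plane aX + bY at the current point
        corner = max(CORNERS, key=lambda c: ga * c[0] + gb * c[1])   # Step 2: best corner
        a = line_search(profit, a, corner)         # Step 3: hop back along the connecting line
    return a


x_opt, y_opt = hop((0.0, 0.0))
print(x_opt, y_opt, profit(x_opt, y_opt))
```

In this example the constrained optimum is the tangency point X = Y = 1.5 with profit 7.5, which is not a corner. The iterates approach it only gradually, which illustrates why such methods approximate the solution rather than reach it exactly.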

We come now to the “creeping” or, more correctly, the gradient methods 
for solving nonlinear programming problems. Geometrically, they involve 
our sliding around the feasible region, always in continuous motion (no 
jumps) and always in a direction which goes uphill (downhill) on the 
profit (cost) function. These methods can be used even when both the 
objective function and the constraints contain nonlinearities. Such meth- 


5 This is done by taking the profit function to be total profit = aX + bY and setting 
$a = \partial f/\partial X$ and $b = \partial f/\partial Y$, where f(X, Y) is the true nonlinear profit function and the 
partial derivatives are evaluated at point A, i.e., at $X = X_A$ and $Y = Y_A$. 




ods were first proposed for problems relating to programming by George 
Brown and von Neumann. Samuelson later showed that they bear some 
analogy with the way in which a market mechanism can approach its 
equilibrium. The basic idea of the gradient methods is very simple. Suppose 
it is desired to find the outputs which maximize profit, $R = f(X_1, \ldots, X_n)$, 
where the $X_i$ are the outputs of the firm’s different products. A gradient 
method sets up the differential equation (in which $t$ represents computing 
time elapsed) 


$$\frac{dX_i}{dt} = \frac{\partial R}{\partial X_i} \qquad (i = 1, \ldots, n).$$


This states that we increase the quantity $X_i$ of commodity $i$ ($dX_i/dt > 0$) 
in the trial solution, so long and only so long as this increase in $X_i$ results 
in a rise in the firm's profits. Moreover, we make this time rate of increase 
in output, $\dot{X}_i$, proportionate (equal for an appropriate time unit) to its 
marginal profitability ($\partial R/\partial X_i$). In other words, we increase (decrease) 
all quantities whose rise leads to higher (lower) profits and, in effect, 
give a priority ordering to the changes in the different quantities in 
proportion to their profit contribution. Moreover, we impose the condition that 
any quantity which falls to zero be stopped at that point 
[$dX_i/dt = 0$ if $X_i = 0$ and $\partial R/\partial X_i < 0$ 
(i.e., if $X_i$ has just been falling)] so that we do not get into the economic 
nonsense of negative outputs. This is the essence of the gradient methods. 
If the problem is one involving diminishing returns throughout, the 
solution to the gradient-method differential equations will converge, over time, 
to the true maximum, i.e., the quantities, $X_i$, will all approach their 
profit-maximizing values. Gradient methods can also be employed in linear 
programming computations. There is reason to believe that they are slower 
than the simplex method when an accurate solution is required, but there 
is not yet enough computational experience for a firm statement on this 
matter. 
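
On a computer the differential equations are followed in small time steps. The sketch below is one simple way of doing so (Euler steps of length `dt`), applied to an assumed two-product profit function with diminishing returns; the function names, the profit function, the starting point, and the step size are all hypothetical. An output that reaches zero while its marginal profit is negative is held at zero, as the condition in the text requires.

```python
def gradient_method(marginal_profit, x, dt=0.01, steps=5000):
    """Follow dX_i/dt = dR/dX_i in small steps, holding any falling output at zero."""
    x = list(x)
    for _ in range(steps):
        g = marginal_profit(x)                 # all marginal profits at the current trial solution
        for i in range(len(x)):
            if x[i] <= 0.0 and g[i] < 0.0:
                x[i] = 0.0                     # output has fallen to zero: stop it there
            else:
                x[i] = max(0.0, x[i] + dt * g[i])
    return x


def marginal_profit(x):
    """Partial derivatives of the hypothetical profit R = 10*X1 - X1**2 - 3*X2 - X2**2 - X1*X2."""
    x1, x2 = x
    return [10 - 2 * x1 - x2, -3 - 2 * x2 - x1]


solution = gradient_method(marginal_profit, [1.0, 4.0])
print(solution)
```

Here the second product is unprofitable at every nonnegative output, so its quantity falls to zero and stays there, while the first product's output settles at 5, where its marginal profit vanishes.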

Still another, indeed, so far one of the most successful methods of 
dealing with nonlinear programming problems involves the use of “piecewise 
linear” approximations. Clearly, a circle can be approximated to any 
desired degree of accuracy by an inscribed polygon. Similarly, other 
well-behaved curves can be approximated by connected straight-line 
segments. In this way the objective function and constraints of a nonlinear 
program can be approximated linearly, and the resulting approximation 
can be handled by variants of the simplex method of linear programming. 


Kuhn-Tucker Methods* 
8 


1. Kuhn-Tucker Analysis¹ 


We come now to a class of theorems of nonlinear programming that 
have been the focal point of the mathematician’s interest in the subject. 
These theorems are deeper than the material on programming covered in 
the preceding three chapters. The basic theorems were contributed by 
H. W. Kuhn and A. W. Tucker. 

It will be recalled from Chapter 4 that a maximum or minimum problem 
involving only equality constraints, and which is amenable to treatment by 
standard calculus techniques, can be approached by the classical method of 
Lagrange multipliers. This method utilizes an “artificial” Lagrangian 
problem that is related to the original problem in one important respect: 
Their solutions are the same. In bare outline, construction of this artificial 
substitute problem involves the following steps: 


1. Take each of the constraints and bring all of the terms over to one 
side of the equation; e.g., rewrite $X + Y = 5$ as $X + Y - 5 = 0$. 

2. Multiply each of the constraints by a variable whose value is unspeci- 
fied, this variable being the Lagrange multiplier. For example, the preceding 
constraint becomes $\lambda(X + Y - 5) = 0$. 


* This chapter is somewhat more advanced than the preceding discussions of 
programming analysis. 

1 The following discussion generalizes some of the materials in Appendix A of 
Chapter 6. 

3. Add together all of the constraints (each multiplied by its own 
Lagrange multiplier) and then add this sum to the objective function. This 
sum is called the Lagrangian expression; e.g., if the problem is to maximize 
$X^2 + 3XY + Z$ subject to $X + Y = 5$ and $XZ = 10$, the Lagrangian 
expression is 


$$X^2 + 3XY + Z + \lambda_1(X + Y - 5) + \lambda_2(XZ - 10),$$


where $\lambda_1$ and $\lambda_2$ are two different Lagrange multipliers. 

We now can state the central theorem of Lagrange multiplier theory, 
which asserts that, in a wide class of problems, any values of X, Y, and Z 
which maximize the value of the objective function subject to the stated 
constraints will also maximize the (unconstrained) Lagrangian expression 
and vice versa. In other words, we are given a choice—we can either solve 
the original constrained maximization problem or we can instead solve the 
totally different problem of maximizing the value of the Lagrangian 
expression, a problem that involves no constraints. Either procedure 
automatically solves both problems. Naturally, we then choose the alternative 
that is easier, and often the easier procedure will be the Lagrangian method. 
We will see presently that besides its usefulness as a computational device, 
the Lagrangian approach offers us a great deal of additional analytical 
power. 
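
In practice the Lagrangian is used by setting all of its first partial derivatives equal to zero, as in Chapter 4. The sketch below does this numerically for an example of the form used above, read here as maximizing X² + 3XY + Z subject to X + Y = 5 and XZ = 10. The function names and the bisection interval are assumptions made for illustration. Four of the five conditions give Y = 5 − X, Z = 10/X, λ₁ = −3X, and λ₂ = −1/X; substituting them into the remaining condition leaves a single equation in X, which is solved by bisection.

```python
def lagrangian_gradient(x, y, z, l1, l2):
    """Partial derivatives of L = X^2 + 3XY + Z + l1*(X + Y - 5) + l2*(X*Z - 10)."""
    return (
        2 * x + 3 * y + l1 + l2 * z,   # dL/dX
        3 * x + l1,                    # dL/dY
        1 + l2 * x,                    # dL/dZ
        x + y - 5,                     # dL/dl1 (first constraint)
        x * z - 10,                    # dL/dl2 (second constraint)
    )


def solve_example():
    """Find a stationary point of the Lagrangian with X between 3 and 4."""
    def h(x):
        # dL/dX after substituting Y = 5 - X, Z = 10/X, l1 = -3X, l2 = -1/X
        return 15 - 4 * x - 10 / x ** 2

    lo, hi = 3.0, 4.0                  # h(3) > 0 > h(4), so a root lies in between
    for _ in range(100):
        mid = (lo + hi) / 2
        if h(mid) > 0:
            lo = mid
        else:
            hi = mid
    x = (lo + hi) / 2
    return x, 5 - x, 10 / x, -3 * x, -1 / x


stationary_point = solve_example()
residuals = lagrangian_gradient(*stationary_point)
print(stationary_point, residuals)
```

The point found satisfies both constraints and makes every partial derivative of the Lagrangian vanish. It should be read as a local maximum only: in this particular example Z = 10/X grows without limit as X approaches zero, so the first-order conditions by themselves do not establish a global maximum.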

Professors Kuhn and Tucker have extended this approach to mathe- 
matical programming. We will find it convenient first simply to describe 
their results and the corresponding methods without attempting to justify 
any of the assertions made in the process. It will facilitate the discussion 
to leave explanations to a later section. In sum, the Kuhn-Tucker analysis 
tells us that: 


1. For a wide class of programming problems (including all linear prob- 
lems and all diminishing-returns nonlinear problems) a Lagrangian ex- 
pression can be formed in exactly the way described above for the calculus 
case, and this Lagrangian expression will have the same useful property: 

Whatever values of the variables maximize (minimize) the value of the origi- 
nal objective function subject to its equality or inequality constraints will 
maximize (minimize) the value of the Lagrangian expression (subject only to 
the nonnegativity conditions for the variables). 

This first proposition is, in essence, the main Kuhn-Tucker theorem. 

2. The Lagrangian expression has another interesting property, which 
will be explained further in item 4, below. Suppose, to be concrete, that we 
are dealing with a maximization problem. Then we have the following 
rather curious result: If we treat the Lagrange multipliers as variables, then 
the original problem will have been solved when and only when we have 
found the values of the original problem’s variables which maximize the 
value of the Lagrangian expression and the values of the $\lambda$’s which minimize 
that value. In more technical terminology we call this solution a saddle point. 
[See Appendix A to Chapter 6 and Section 5 of Chapter 18, which explain 
this terminology. The graph of the Lagrangian function has a saddle-like 
shape and the optimal point is the top of a hill when looked at from one 
direction, and the trough of a valley when looked at from another (Figure 2 
of Chapter 6).] 

3. In particular, for reasons that will be suggested toward the end of 
the chapter, in a linear programming problem these Lagrange multipliers 
turn out to be the optimal values (the “prices”) of the dual problem.² 
Moreover, given any primal problem and its dual, the Lagrangian expressions 
for the two problems are identical! 

(This theorem is a direct consequence of the duality theorems of 
Chapter 6. It has already been discussed in Appendix A to that chapter and 
will be derived explicitly at the end of Section 2.) For this reason we will 
henceforward use the symbols for our dual structural variables, $V_1, V_2, \ldots, V_m$, 
to denote the Lagrange multipliers which we have heretofore referred 
to as $\lambda_1, \lambda_2, \ldots, \lambda_m$. 

4. In fact, this duality relationship leads directly to the maximization- 
minimization (saddle-point) property described in 2, above. For suppose 
our primal problem is a maximization problem in which the object is to 
maximize profit ($\Pi$). Then the dual problem must be a minimization 
problem whose objective is to minimize accounting cost ($\alpha$). Let 
$L(Q_1, \ldots, Q_n, V_1, \ldots, V_m)$ represent the common Lagrangian expression 
for the two problems, and let us designate it in briefer notation as $L(Q, V)$. 
Let $Q^\circ$ represent the optimal set of Q's for the primal problem and let $V^\circ$ 
represent the optimal set of V's for the dual. The Lagrangian approach 
then rests on the following assertions: (a) the values of the primal variables 
$Q_1, \ldots, Q_n$ which maximize $\Pi$ must also be those which maximize $L(Q, V^\circ)$, 
i.e., the Lagrangian expression for the primal problem in which we have set 
$V = V^\circ$. 

Similarly, (b) the values of the dual variables $V_1, \ldots, V_m$ which minimize $\alpha$ 
must also minimize $L(Q^\circ, V)$. Hence we have the minimax (saddle-point) 
result: 

If we find a combination of Q's and V's which constitutes a solution to the 
primal and the dual problems, respectively, then for these values the Lagrangian 
expression L(Q, V) will have the lowest value which any V's can give it and the 
highest value which any Q's can give it. 
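
The saddle-point property is easy to verify on a small linear program. The example below is hypothetical: maximize 3Q₁ + 2Q₂ subject to Q₁ + Q₂ ≤ 4 and Q₁ + 3Q₂ ≤ 6, whose optimal solution is Q° = (4, 0) with profit 12, and whose dual (minimize 4V₁ + 6V₂ subject to V₁ + V₂ ≥ 3 and V₁ + 3V₂ ≥ 2) has the optimal solution V° = (3, 0) with the same value. The sketch evaluates the common Lagrangian on a grid of nonnegative trial values.

```python
def lagrangian(q, v):
    """L(Q, V) = objective + V1*(4 - Q1 - Q2) + V2*(6 - Q1 - 3*Q2) for the example program."""
    q1, q2 = q
    v1, v2 = v
    return 3 * q1 + 2 * q2 + v1 * (4 - q1 - q2) + v2 * (6 - q1 - 3 * q2)


Q_OPT = (4, 0)     # optimal primal outputs of the hypothetical program
V_OPT = (3, 0)     # optimal dual prices of the hypothetical program

# Nonnegative trial values for either pair of variables.
grid = [(a, b) for a in range(11) for b in range(11)]

saddle_value = lagrangian(Q_OPT, V_OPT)
max_over_q = max(lagrangian(q, V_OPT) for q in grid)   # best any Q's can do against V°
min_over_v = min(lagrangian(Q_OPT, v) for v in grid)   # lowest any V's can push L given Q°

print(saddle_value, max_over_q, min_over_v)
```

With the V's held at V° no choice of Q's raises the Lagrangian above 12, and with the Q's held at Q° no choice of V's lowers it below 12, which is exactly the saddle-point statement.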


2 The reader may well wonder whether one can construct a dual problem correspond- 
ing to a given nonlinear primal problem and whether analogous properties hold for the 
nonlinear dual. The answer is that duals can be constructed for wide classes of nonlinear 
problems, and that analogous results do apply. While the nonlinear dual objective func- 

Before proceeding to discuss the main mode of utilization of the Kuhn- 
Tucker theorem we may describe very briefly several other applications 
of the theorem: 


1. It provides a helpful interpretation (as dual prices) of Lagrange 
multipliers in the linear and portions of the nonlinear programming theory, 
which sheds some light on Lagrange multiplier theory in general. 

2. It is, in many cases, helpful in computation, giving us alternative 
means for solving a given programming problem. In particular, it has been 
helpful in the gradient methods of solution. 

3. Perhaps most important from the point of view of the mathematician, 
the Kuhn-Tucker theorem serves as a so-called existence theorem in 
mathematical programming. That is, there are some programming problems 
which are unsolvable (they have no solution) because they involve incon- 
sistencies or because there is no effective upper bound to the value of the 
objective function (so that the sky is the limit—there is no finite maximum 
value). Before we try to solve a problem it is desirable to know whether a 
solution even exists, because if it does not exist there is no point in wasting 
time looking for it. An existence theorem is a criterion which can tell us 
whether a solution to a given problem exists, even if it does not show us how 
to go about finding that solution. The Kuhn-Tucker theorem provides such 
an existence criterion, for it states that, for the class of programming prob- 
lems for which it is valid, a problem has a solution if and only if the corre- 
sponding Lagrangian conditions can be satisfied. 


2. The Form of the Lagrangian Expression 


In the Kuhn-Tucker analysis it is particularly easy to become confused 
about the signs of the terms used in constructing the Lagrangian expression. 
The difficulty arises out of the presence of inequality constraints in which, 
unlike the case of an equation, there is an asymmetry between the transfer 
of terms from the right- to the left-hand side of the relationship and the 
transfer in the opposite direction. Since much confusion can result from an 
error in signs in the formulation of the Lagrangian, before going further let 


tion is somewhat more complicated than the dual’s objective function in the linear case, 
the dual constraints are the obvious nonlinear generalization of the linear dual con- 
straints. In economic terms, the nonlinear dual constraint requires that the marginal 
profit yield of output j be no greater than the accounting value of all the marginal input 
requirements of output j. 

For further references on duality in nonlinear programming see Philip Wolfe, “A 
Duality Theorem for Nonlinear Programming,” Quarterly of Applied Mathematics, 
Vol. 19, 1961, and M. L. Balinski and W. J. Baumol, “The Dual in Nonlinear Programming 
and Its Economic Interpretation,” Review of Economic Studies, Vol. XXXV, July 1968. 

us go through the rather uninteresting step of describing a standardized 
procedure that is designed to reduce the likelihood of this sort of mistake. 
The procedure is described in the following two rules, which, for the moment, 
are offered with no attempt at explanation, but which will be justified in 
footnote 4. 


RULE 1. Write each constraint so that all nonzero terms appear on one 
side of the inequality or the equals sign and only a zero appears on the other 
side of the expression. In a maximization problem all inequality constraints 
should be rewritten in the form $S \ge 0$, where $S$ is the sum of all nonzero terms, 
while in a minimization problem the inequality should be reversed, i.e., the 
relationship should be written in the form $S \le 0$. 


Example: Suppose a maximization problem contains the constraint $Q_1^2 + 2Q_2 \ge 4$. 
This should be rewritten in standard form (by Rule 1) as $-4 + Q_1^2 + 2Q_2 \ge 0$. 


RULE 2. To form the Lagrangian, multiply each of the resulting 
expressions, $S_i$, by its own Lagrange multiplier, $V_i$, and add the resulting 
product to the original objective function. After all these additions have 
been completed one has obtained the requisite Lagrangian whose maximum 
(or, rather, whose saddle point) remains to be found. 


To summarize, if our problem involves maximization of 
$\Pi = f(Q_1, \ldots, Q_n)$ subject to the constraints $g_1(Q_1, \ldots, Q_n) \le c_1, \ldots,$ 
$g_m(Q_1, \ldots, Q_n) \le c_m$, then the Lagrangian function is 

$$L(Q, V) = f(Q_1, \ldots, Q_n) + V_1[c_1 - g_1(Q_1, \ldots, Q_n)] + \cdots + V_m[c_m - g_m(Q_1, \ldots, Q_n)].$$

The Kuhn-Tucker theorem then asserts that if we find the Q’s for which 
$L(Q, V)$ is a maximum and the V’s for which this function is a minimum, 
subject only to the nonnegativity conditions $Q_j \ge 0$, $V_i \ge 0$, we will have 
found the solutions to the original primal and dual problems. 


Example: Consider the program 


maximize $\Pi = Q_1 + Q_1 Q_2$ 

subject to $Q_1 - 10 \le -Q_2$ 
$Q_1 + 2Q_2 \ge 4$ 
$Q_1 \ge 0$, $Q_2 \ge 0$. 


After rewriting our constraints as indicated in Rule 1 we have as our 
Lagrangian expression 


$$L(Q, V) = Q_1 + Q_1 Q_2 + V_1(10 - Q_1 - Q_2) + V_2(-4 + Q_1 + 2Q_2),$$

which is constrained only by the nonnegativity conditions $Q_1 \ge 0$, $Q_2 \ge 0$, 
$V_1 \ge 0$, $V_2 \ge 0$. The reader will note that because we are dealing with a 
maximization problem each constraint has been rewritten as a “greater 
than or equal to” relationship. 
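
Rules 1 and 2 are mechanical enough to be written as a short routine. The sketch below is an illustration, not a standard library facility: the helper `make_lagrangian`, its argument format, and the trial values are assumptions, and the example program is taken in the form in which it is read here (objective Q₁ + Q₁Q₂ with the constraints Q₁ − 10 ≤ −Q₂ and Q₁ + 2Q₂ ≥ 4). Each constraint is supplied as written, and the routine itself moves all terms to one side with the sign convention that Rule 1 prescribes.

```python
def make_lagrangian(objective, constraints, maximize=True):
    """Build L(q, v) from constraints given as (lhs, sense, rhs), with sense '<=' or '>='."""
    def standardized(q):
        terms = []
        for lhs, sense, rhs in constraints:
            # Rule 1: write the constraint as S >= 0 ...
            s = rhs(q) - lhs(q) if sense == "<=" else lhs(q) - rhs(q)
            # ... and reverse it to S <= 0 in a minimization problem.
            terms.append(s if maximize else -s)
        return terms

    def lagrangian(q, v):
        # Rule 2: multiply each S_i by its own multiplier V_i and add to the objective.
        return objective(q) + sum(vi * si for vi, si in zip(v, standardized(q)))

    return lagrangian


# The example program, with each constraint entered as it is stated.
L = make_lagrangian(
    objective=lambda q: q[0] + q[0] * q[1],
    constraints=[
        (lambda q: q[0] - 10, "<=", lambda q: -q[1]),
        (lambda q: q[0] + 2 * q[1], ">=", lambda q: 4),
    ],
)

# Compare with the hand-written expression at an arbitrary trial point.
q_trial, v_trial = (2, 3), (1, 2)
by_hand = 2 + 2 * 3 + 1 * (10 - 2 - 3) + 2 * (-4 + 2 + 2 * 3)
print(L(q_trial, v_trial), by_hand)
```

Entering the constraints as stated and letting the routine standardize them removes the most common source of sign errors, namely transferring terms across an inequality by hand.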

We can now apply the rules for the formation of the Lagrangian ex- 
pression to demonstrate our earlier assertion about the relationship be- 
tween the Lagrangians for a pair of dual linear programs. Specifically, we 
will show that L(Q, V°), the Lagrangian for a primal problem, is, term by 
term, identical in form with the Lagrangian for its dual, L(Q°, V), accepting 
for the moment the unproved assertion that the values of the Lagrange 
multipliers for the primal are the $V_i^\circ$, the optimal values of the dual 
structural variables, and that the values of the Lagrange multipliers for the 
dual are the $Q_j^\circ$, the optimal values of the primal structural variables. 


Proof: In general notation we have as our primal problem 


Max I = D> PQ; 


j=l 


subject to 


D aQ; S cs G=1,---, m). 


j= 


By Rules 1 and 2, this corresponds to the Lagrangian problem 


Max  L(Q, V) = È PA; +È Ve- 22249). 


Similarly, the dual program can be written 
m 
Min a= av; 


subject to 


whose Lagrangian problem is 
Min L(Q',V) = 2 Vici + 2 QP; — D aN). 
i= j= iel 


A glance at both of the preceding Lagrangian expressions indicates 
that each is a special case of the general Lagrangian form 


LQ V) = Z2 P+ 22Ve— 3 Va. 
The primal Lagrangian problem, then, is to maximize $L(Q, V)$ with
$V_1, \cdots, V_m$ set, respectively, equal to $V_1^\circ, \cdots, V_m^\circ$, and the dual
problem is to minimize the same function with the values for the
$Q_1, \cdots, Q_n$ set equal to $Q_1^\circ, \cdots, Q_n^\circ$. Thus the solution is what is
called a saddle point for the general function $L(Q, V)$, with that function
maximized with respect to the values of the $Q_j$'s and minimized with
respect to the values of the $V_i$'s.
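
The saddle-point property can be checked numerically on a small case. The following is a minimal sketch, not part of the original exposition: the two-product, two-resource program, its coefficients, and the names `lagrangian` and `is_saddle_point` are all hypothetical, and the optimal values were found by inspecting the corner points of the two feasible regions.

```python
# Hypothetical linear program (illustrative numbers only):
#   primal: max 3*Q1 + 2*Q2  s.t.  Q1 + Q2 <= 4,  Q1 + 3*Q2 <= 6,  Q >= 0
#   dual:   min 4*V1 + 6*V2  s.t.  V1 + V2 >= 3,  V1 + 3*V2 >= 2,  V >= 0
P = [3.0, 2.0]                   # unit profits P_j
C = [4.0, 6.0]                   # capacities c_i
A = [[1.0, 1.0], [1.0, 3.0]]     # coefficients a_ij

Q_OPT = [4.0, 0.0]               # primal optimum (assumed, from the corner points)
V_OPT = [3.0, 0.0]               # dual optimum (assumed, from the corner points)


def lagrangian(q, v):
    """General form: sum P_j Q_j + sum V_i c_i - sum sum a_ij V_i Q_j."""
    profit = sum(P[j] * q[j] for j in range(2))
    capacity_value = sum(v[i] * C[i] for i in range(2))
    cross = sum(A[i][j] * v[i] * q[j] for i in range(2) for j in range(2))
    return profit + capacity_value - cross


def is_saddle_point(grid):
    """Check L(Q, V_OPT) <= L(Q_OPT, V_OPT) <= L(Q_OPT, V) on a nonnegative grid."""
    saddle_value = lagrangian(Q_OPT, V_OPT)
    for a in grid:
        for b in grid:
            if lagrangian([a, b], V_OPT) > saddle_value + 1e-9:
                return False     # some Q raises L above the saddle value
            if lagrangian(Q_OPT, [a, b]) < saddle_value - 1e-9:
                return False     # some V lowers L below the saddle value
    return True


grid = [0.5 * k for k in range(21)]      # 0.0, 0.5, ..., 10.0
print(lagrangian(Q_OPT, V_OPT))          # common value of primal and dual objectives
print(is_saddle_point(grid))
```

With the multipliers fixed at the dual optimum the function reduces to $12 - Q_2$, and with the outputs fixed at the primal optimum it reduces to $12 + 2V_2$, so moving $Q$ can only lower it and moving $V$ can only raise it.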


3. The Kuhn-Tucker Conditions


The observation that a constrained maximization problem can be 
translated into an unconstrained Lagrangian expression is highly illuminat- 
ing in itself. But we have seen in the discussion of the subject in the context 
of the differential calculus that many of the uses of this transformation
depend on the next step, the employment of the first-order conditions for
the maximization of the Lagrangian expression
$L(Q_1, \cdots, Q_n, V_1, \cdots, V_m)$. That is, in the analysis and solution of a
constrained maximum problem in the calculus we make use of the
requirement that all of the partial derivatives of this expression must be
equal to zero, i.e., that


(1) $\partial L(Q, V)/\partial Q_j = 0 \qquad (j = 1, 2, \cdots, n)$
and
(2) $\partial L(Q, V)/\partial V_i = 0 \qquad (i = 1, 2, \cdots, m).$


It will be recalled that one normally proceeds to solve simultaneously
these $n + m$ equations for the optimal values of our $n + m$ variables
$Q_1, \cdots, Q_n, V_1, \cdots, V_m$.

It is natural to ask whether a similar strategy can be extended to pro- 
gramming problems—to cases involving inequality constraints. The Kuhn- 
Tucker analysis assures us that, if the relevant functions are differentiable, 
such an extension is indeed possible and shows just how it can be done. 

First, however, we must be careful to recall that conditions of the sort 
we will be discussing are by themselves necessary but not sufficient for a 
set of variable values to constitute a maximum of the Lagrangian and hence 
of the original constrained maximum problem. That is, if our first-order 
conditions are violated, the candidate solution certainly cannot constitute 
a maximum. But even if the first-order conditions are satisfied, it is still 
possible that the proposed solution will not do. We can be sure that the 
variable values under consideration will do the trick only if, in addition to 
the first-order conditions, they satisfy the requirements of the second-order 
conditions. 

Though it is possible to weaken the second-order requirements some- 
what, for our purposes we may take the following to constitute the second- 
order conditions (here it does not matter whether we are dealing with a 
programming problem or one to which the calculus can be applied). 
Second-Order Conditions:


a. The set of constraints must define a feasible region which is every- 
where convex. 

b. For a local maximum the original objective function must be con- 
cave in the neighborhood of the maximum point, while for a local minimum 
it must be convex. If these concavity-convexity conditions hold throughout, 
the extreme point will be a global maximum (or minimum). 


The logic of these second-order requirements has already been discussed 
earlier in this chapter. 

We may now turn to the crucial first-order requirements, called the 
Kuhn-Tucker conditions. It will be seen that they are indeed an extension 
of the corresponding requirements of the differential calculus in which the 
first derivatives are set equal to zero. The following are the Kuhn-Tucker 
conditions for a maximization problem—the direction of the inequalities 
simply requiring reversal for a problem of minimization: 


Kuhn-Tucker Maximum Conditions: 


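(3) $\partial L(Q, V)/\partial Q_j \le 0 \qquad (j = 1, 2, \cdots, n)$

(4) $Q_j[\partial L(Q, V)/\partial Q_j] = 0 \qquad (j = 1, 2, \cdots, n)$

(5) $\partial L(Q, V)/\partial V_i \ge 0 \qquad (i = 1, 2, \cdots, m)$

(6) $V_i[\partial L(Q, V)/\partial V_i] = 0 \qquad (i = 1, 2, \cdots, m)$

$Q_j \ge 0, \quad V_i \ge 0.$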


The reader will note that the two sets of conditions of the differential 
calculus case (1) and (2) have been replaced by four sets of conditions, two
of them involving inequalities. Moreover, it is the inequality Kuhn-Tucker 
conditions (3) and (5) that correspond, respectively, to the differential 
calculus requirements (1) and (2), while the two remaining condition sets 
seem basically new. In the next section we will examine the logic of these 
requirements. But first we must make certain that the reader understands 
their construction. 

For this purpose we return to the illustrative Lagrangian function of 
the preceding section: 


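$$L(Q, V) = Q_1^3 + Q_1 Q_2 + V_1(10 - Q_1 - Q_2^5) + V_2(-4 + Q_1^2 + 2Q_2).$$

The reader may verify directly by differentiating partially that the
corresponding Kuhn-Tucker conditions are


(3a) $\partial L/\partial Q_1 = 3Q_1^2 + Q_2 - V_1 + 2V_2Q_1 \le 0$
     $\partial L/\partial Q_2 = Q_1 - 5V_1Q_2^4 + 2V_2 \le 0$

(4a) $Q_1\,\partial L/\partial Q_1 = Q_1(3Q_1^2 + Q_2 - V_1 + 2V_2Q_1) = 0$
     $Q_2\,\partial L/\partial Q_2 = Q_2(Q_1 - 5V_1Q_2^4 + 2V_2) = 0$

(5a) $\partial L/\partial V_1 = 10 - Q_1 - Q_2^5 \ge 0$
     $\partial L/\partial V_2 = -4 + Q_1^2 + 2Q_2 \ge 0$

(6a) $V_1\,\partial L/\partial V_1 = V_1(10 - Q_1 - Q_2^5) = 0$
     $V_2\,\partial L/\partial V_2 = V_2(-4 + Q_1^2 + 2Q_2) = 0$

$Q_1 \ge 0, \quad Q_2 \ge 0, \quad V_1 \ge 0, \quad V_2 \ge 0.$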


It will be observed that these have been grouped and numbered to corre- 
spond, respectively, to the general Kuhn-Tucker conditions (3), (4), (5), 
and (6). 
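
The differentiation can also be checked mechanically. The sketch below is an illustration rather than part of the original text: it assumes the illustrative Lagrangian is $Q_1^3 + Q_1Q_2 + V_1(10 - Q_1 - Q_2^5) + V_2(-4 + Q_1^2 + 2Q_2)$, and it compares the four partial derivatives used in the conditions with central finite differences at an arbitrarily chosen test point. The function names and the test point are hypothetical.

```python
def lagrangian(q1, q2, v1, v2):
    """Illustrative Lagrangian of the worked example (form assumed from the text)."""
    return q1**3 + q1 * q2 + v1 * (10 - q1 - q2**5) + v2 * (-4 + q1**2 + 2 * q2)


def analytic_gradient(q1, q2, v1, v2):
    """Partial derivatives with respect to Q1, Q2, V1, V2, in that order."""
    return (
        3 * q1**2 + q2 - v1 + 2 * v2 * q1,   # dL/dQ1, must be <= 0
        q1 - 5 * v1 * q2**4 + 2 * v2,        # dL/dQ2, must be <= 0
        10 - q1 - q2**5,                     # dL/dV1, must be >= 0
        -4 + q1**2 + 2 * q2,                 # dL/dV2, must be >= 0
    )


def numeric_gradient(f, point, h=1e-6):
    """Central-difference approximation to each partial derivative of f."""
    grads = []
    for k in range(len(point)):
        up = list(point)
        down = list(point)
        up[k] += h
        down[k] -= h
        grads.append((f(*up) - f(*down)) / (2 * h))
    return tuple(grads)


test_point = (1.0, 1.2, 0.5, 0.7)            # arbitrary point, not a solution
exact = analytic_gradient(*test_point)
approx = numeric_gradient(lagrangian, test_point)
largest_gap = max(abs(a - b) for a, b in zip(exact, approx))
print(largest_gap < 1e-5)
```

Agreement at one point does not prove the formulas, but a disagreement would flag a slip in the differentiation.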


4. Rationale of the Kuhn-Tucker Conditions


As we will see now, the Kuhn-Tucker conditions are the natural
generalization of the calculus requirements (1) and (2) to take account of the
possibility that the maximum or minimum in question can occur at a
boundary point (a corner solution) rather than at an interior point (see
Chapter 3, Section 10). The calculus requirements are generally appropriate
only if the extremum (i.e., the maximum or minimum) occurs at a point at
which all of the variables (including slacks) take nonzero values, that is, at
a point not on any of the boundaries of the feasible region. The Kuhn-Tucker
conditions, as we shall see now, apply to either case. The role of the
novel requirements (4) and (6) is, in effect, to determine whether the
interior or corner maximum rules apply. Naturally, this issue arises only
because the variables are constrained to take nonnegative values, since
otherwise $Q_j = 0$ would no longer be a boundary point.

An intuitive view of the matter is simple enough. Consider the
maximization of $y = f(x_1, x_2, \cdots, x_n)$ subject to all $x_i \ge 0$. Suppose, first,
that we are at a point at which the value of $x_1$ can either be increased or
decreased (an interior point, as far as this variable is concerned). Then
(point A in Figure 1) by the usual logic of the marginal analysis we must
have $\partial y/\partial x_1 = 0$, for otherwise either a rise or a fall in the value of $x_1$
could increase the value of $y$, and so $y$ could not be at its maximum.

[Figure 1: $y$ plotted against $x_1$, showing an interior maximum at point A
($\partial y/\partial x_1 = 0$), a corner maximum at point B ($\partial y/\partial x_1 = 0$), and a corner
maximum at point C ($\partial y/\partial x_1 < 0$).]

On the other hand, suppose we are testing for the possibility of a corner
maximum at which $x_1 = 0$. Here, of the three possibilities $\partial y/\partial x_1 = 0$,
$\partial y/\partial x_1 < 0$, and $\partial y/\partial x_1 > 0$, we can only rule out the last. That is, if
$\partial y/\partial x_1 > 0$, then one can increase the value of $y$ by raising $x_1$ above the
corner value $x_1 = 0$ so that $x_1 = 0$ cannot possibly be the coordinate of a
maximum point. However, if $\partial y/\partial x_1 = 0$, the point with $x_1 = 0$ (point B
in Figure 1) may be a maximum for the usual reasons, while if $\partial y/\partial x_1 < 0$
it may be a maximum point simply because it is impossible to reduce the
value of $x_1$ any further (point C in Figure 1). We conclude


Rule 3. Given a differentiable function $y = f(x_1, \cdots, x_n)$:
(a) For an interior maximum it is necessary that $\partial y/\partial x_i = 0$
$(i = 1, 2, \cdots, n)$.

(b) For a corner maximum it is necessary that $\partial y/\partial x_i \le 0$.
(c) For a corner minimum it is necessary that $\partial y/\partial x_i \ge 0$.


The reader should check for himself that part (c) of Rule 3 follows directly 
by the same reasoning that explains the use of the reversed inequality for 
the maximization case. 
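
Parts (a) and (b) of Rule 3 can be written as a small test for one nonnegative variable. The sketch below is illustrative only: the function name `first_order_check`, the tolerance, and the three quadratic functions used to exercise it are assumptions made for the example.

```python
def first_order_check(x, slope, tol=1e-9):
    """Apply Rule 3 (a)-(b) to one nonnegative variable of a maximization problem.

    x     -- candidate value of the variable
    slope -- value of dy/dx at the candidate
    """
    if x < -tol:
        return "infeasible"                  # violates x >= 0
    if x > tol:
        # Interior point: x can move both ways, so the slope must vanish.
        return "interior candidate" if abs(slope) <= tol else "not a maximum"
    # Corner x = 0: x can only be raised, so only a positive slope is ruled out.
    return "corner candidate" if slope <= tol else "not a maximum"


# Hypothetical test functions and their slopes:
#   y = -(x - 2)^2 has slope -2(x - 2): interior maximum at x = 2 (like point A)
#   y = -x^2       has slope -2x:       corner maximum with zero slope (like point B)
#   y = -(x + 1)^2 has slope -2(x + 1): corner maximum with negative slope (like point C)
print(first_order_check(2.0, -2 * (2.0 - 2)))   # interior candidate
print(first_order_check(0.0, -2 * 0.0))         # corner candidate
print(first_order_check(0.0, -2 * (0.0 + 1)))   # corner candidate
print(first_order_check(0.0, -2 * (0.0 - 2)))   # not a maximum (slope is +4 at the corner)
```

The word "candidate" is deliberate: these are necessary conditions only, so passing the test does not by itself establish a maximum.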

We observe next that (4) and (6), the two completely novel requirements 
among the Kuhn-Tucker conditions, serve essentially to determine which 
of the two regimes applies: whether the corner or the interior maximum con- 
ditions are pertinent. Consider first the condition 


(4) $Q_j[\partial L(Q, V)/\partial Q_j] = 0.$


This can be translated into the equivalent requirement


(4′) either $Q_j = 0$ or $\partial L/\partial Q_j = 0$ (or both).


The implication should be clear. If the value of $Q_j$ under consideration is
nonzero (interior maximum case) then (4) requires $\partial L/\partial Q_j = 0$, the
condition which we have seen to be necessary in this case. However, if we are at
a corner where $Q_j = 0$, then (4) or, more obviously, (4′) tells us that
$\partial L/\partial Q_j$ may or may not be equal to zero. However, now we see by
Kuhn-Tucker condition (3) that we then must have $\partial L/\partial Q_j \le 0$, which, by
Rule 3, is the requirement for a corner maximum.
Similarly, (6) can be translated into
(6′) either $V_i = 0$ or $\partial L/\partial V_i = 0$ (or both).³


This tells us, analogously with the interpretation of (4′), that for
$V_i \ne 0$ (an interior point), we must have $\partial L/\partial V_i = 0$, while for $V_i = 0$
(a corner), comparison with (5) tells us that $\partial L/\partial V_i \ge 0$. Here the
inequality is reversed from that in (3) because, it will be recalled, we seek a
saddle point of $L(Q, V)$, i.e., we are looking for values of $Q$ and $V$ at which
$L(Q, V)$ is at a maximum with respect to the $Q$ and is at a minimum with
respect to the $V$. Thus, by Rules 3b and 3c we require $\partial L/\partial Q_j \le 0$ and
$\partial L/\partial V_i \ge 0$ just as is specified by Kuhn-Tucker conditions (3) and (5).

To summarize, the Kuhn-Tucker conditions are a natural generalization 
of the calculus requirements for an extremum. While the latter apply only
to interior maxima, the former adapt themselves automatically to corner or
to interior maxima as the circumstances require. The Kuhn-Tucker
approach, then, permits us to treat a programming problem in two steps.
First, by employing a Lagrangian expression for the original constrained
maximization problem, we substitute a saddle-point problem involving no
constraints other than the nonnegativity requirements. Second, by use of
the Kuhn-Tucker conditions we are able to choose among both interior and
corner points and thereby to arrive at the true extremum, provided that 
the second-order conditions are satisfied. 
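
The second step can be sketched as a routine that reports which of conditions (3) through (6) hold at a candidate point. This is only an illustration under stated assumptions: the helper `kuhn_tucker_report`, the tolerance, and the one-variable problem (maximize $4Q - Q^2$ subject to $Q \le 1$, $Q \ge 0$) are hypothetical and do not come from the text.

```python
def kuhn_tucker_report(q, v, dL_dq, dL_dv, tol=1e-9):
    """Report which Kuhn-Tucker maximum conditions hold at a candidate.

    q, v         -- lists of candidate values of the Q_j and V_i
    dL_dq, dL_dv -- lists of the partial derivatives of L at the candidate
    """
    return {
        "(3)": all(d <= tol for d in dL_dq),
        "(4)": all(abs(x * d) <= tol for x, d in zip(q, dL_dq)),
        "(5)": all(d >= -tol for d in dL_dv),
        "(6)": all(abs(m * d) <= tol for m, d in zip(v, dL_dv)),
        "nonnegativity": all(x >= -tol for x in q + v),
    }


def candidate_report(q, v):
    """Hypothetical problem: L = 4Q - Q^2 + V(1 - Q)."""
    dL_dq = [4.0 - 2.0 * q - v]
    dL_dv = [1.0 - q]
    return kuhn_tucker_report([q], [v], dL_dq, dL_dv)


for q, v in [(1.0, 2.0), (2.0, 0.0), (0.0, 0.0)]:
    report = candidate_report(q, v)
    verdict = "passes" if all(report.values()) else "fails"
    print((q, v), verdict, report)
```

The unconstrained peak $Q = 2$ fails (5) because it violates the constraint, the corner $Q = 0$ fails (3) because profit still rises there, and only $Q = 1$ with $V = 2$ passes every test. Since the objective in this sketch is concave and the feasible region convex, the second-order conditions are met as well.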


PROBLEMS 


Write the Lagrangian expressions and the Kuhn-Tucker conditions for the 
following problems. 


1. Maximize $\Pi = 7Q_1 - 2Q_1Q_2 + Q_2^2$


subject to $Q_1 + Q_2 \le 400$
$Q_1Q_2 \ge 200.$


2. Minimize the value of $\Pi$ in the preceding problem subject to the same constraints.
3. Minimize $\Pi = 6Q_1Q_2$


subject to $2Q_1 + Q_2 \ge 50$
$Q_1 \le 10.$


5. Interpretation of the Kuhn-Tucker Conditions


Some additional observations will shed further light on the economics of 
the Kuhn-Tucker conditions and on the nature of their workings. It will be 


3 The reader may recognize a resemblance between the explanation offered for (4)
and (6) and the duality theorems $U_iV_i = 0$ and $L_jQ_j = 0$ which were described in Section
3 of Chapter 6. The similarity is no coincidence as will be shown in the next section.
recalled that (by bringing all terms over to one side) the typical constraint
in our problem may be written as


(7) $0 \le c_i - g_i(Q_1, \cdots, Q_n).$


The Lagrangian function is obtained by multiplying the right-hand side
of this expression by its Lagrange multiplier, $V_i$, and adding it (and all
other such constraint expressions) to the original objective function
$\Pi = f(Q_1, \cdots, Q_n)$. The explicit form of the Lagrangian function is,
therefore,


(8) $L(Q, V) = f(Q_1, \cdots, Q_n) + \sum_i V_i[c_i - g_i(Q_1, \cdots, Q_n)].$


We can now see at once the meaning of the Kuhn-Tucker condition (5),
$\partial L/\partial V_i \ge 0$, for by direct differentiation of (8) this becomes


(9) $\partial L/\partial V_i = c_i - g_i(Q_1, \cdots, Q_n) \ge 0,$
which is identical with our original constraint (7).


In other words, Kuhn-Tucker condition (5) is simply the ith constraint 
equation itself, in somewhat disguised form. That is how the Kuhn-Tucker 
conditions guarantee that in solving the Lagrangian problem the con- 
straints of the original problem are satisfied, even though no explicit con- 
straints appear in the Lagrangian formulation of a problem. This remark 
suggests how the Lagrangian approach permits one to substitute for a 
constrained maximum problem one involving no constraints. The La- 
grangian expression is simply constructed in such a way that in the process 
of maximizing the latter, as a consequence of the requirement $\partial L/\partial V_i \ge 0$
for all $i$, the original constraints are automatically satisfied.⁴


4 We can now finally explain the conventions about the construction of the
Lagrangian which are embodied in Rules 1 and 2 of Section 2, above. The point in these
rules is to construct a Lagrangian expression having the properties we desire.
Specifically, we want $L(Q, V)$ to have a form that guarantees that the Kuhn-Tucker condition
$\partial L/\partial V_i \ge 0$ is equivalent to the ith primal constraint, and that the Kuhn-Tucker
condition $\partial L/\partial Q_j \le 0$ is the same as the jth dual constraint (as discussed later in this section).

To see how this is accomplished, recall that by Rule 1 one rewrites the ith primal
constraint as $S_i \ge 0$, where $S_i$ is the sum of all nonzero terms in the constraint. This sum,
$S_i$, is then multiplied by $+V_i$ (Rule 2) to yield the term $V_iS_i$. Thus, the Kuhn-Tucker
requirement (5) $\partial L/\partial V_i \ge 0$ gives us, by direct differentiation, $S_i \ge 0$, our original
constraint, as we desired. Now suppose that instead of following Rule 1, we had brought all
the terms in the original constraint over to the other side of the inequality to yield
$-S_i \le 0$. Multiplying $-S_i$ by $V_i$ we obtain the Lagrangian term $-V_iS_i$, so that
Kuhn-Tucker condition (5) now becomes $-S_i \ge 0$, which is clearly not the original constraint.
Thus, if we bring the terms of the constraint over to the “wrong” side in violation of
Rule 1, the method will not work because the original constraints will not, generally, be
satisfied.
There is an alternative interpretation of (5) which is also important and
which immediately offers us an economic translation of Kuhn-Tucker
condition (6). For this purpose recall that the ith primal slack variable, $U_i$
(unused capacity), is defined by a rewriting of the primal constraint as


$$g_i(Q_1, \cdots, Q_n) + U_i = c_i \quad \text{or} \quad U_i = c_i - g_i(Q_1, \cdots, Q_n).$$


Comparison with (9) shows at once that our Kuhn-Tucker condition (5)
can now be rewritten simply as


$$\partial L/\partial V_i = U_i \ge 0.$$
In other words, the derivative of the Lagrangian with respect to $V_i$, the ith
Lagrange multiplier, is just $U_i$, the ith slack variable, and so Kuhn-Tucker
condition (5) amounts simply to the requirement that the values of the
slack variables be nonnegative!
Moreover, making the substitution $\partial L/\partial V_i = U_i$ into Kuhn-Tucker
condition (6) we obtain


$$V_i\,\partial L/\partial V_i = V_iU_i = 0.$$


This is nothing more or less than the familiar duality theorem which states
that either $U_i$, the unused capacity of resource $i$, equals zero, or that $V_i$, the
marginal valuation of that resource, is zero (or both). There is thus a very
good explanation for the structural resemblance between the implications
of the duality requirement $U_iV_i = 0$ and those which we obtained from
requirement (6) in the preceding section.

We may now quickly offer analogous interpretations of Kuhn-Tucker
conditions (3) and (4). In the dual of our programming problem let $T_j$
represent the dual slack variable in the jth dual constraint. It will be
recalled from Chapter 6 that this can be interpreted as the opportunity loss
incurred by the production of a unit of output $j$.⁵

Then, by direct analogy with the preceding discussion of Kuhn-Tucker 
conditions (5) and (6), it can be shown that 

(a) The jth Kuhn-Tucker condition (3) is equivalent to the jth con- 
straint of the dual problem. 


(b) Condition (3) may be rewritten as
$-\partial L/\partial Q_j = T_j \ge 0 \qquad (j = 1, \cdots, n),$


i.e., it is the nonnegativity requirement for the dual slack variable, and the
partial derivative of $L$ with respect to $Q_j$ is simply minus one times the
slack variable $T_j$.


5 We no longer use $L_j$ to denote the value of the dual slack variable, to avoid
confusion with $L(Q, V)$, which is employed in this chapter to represent the Lagrangian
function.
(c) Condition (4) may be rewritten as
$T_jQ_j = 0 \qquad (j = 1, \cdots, n),$
which is the standard proposition of duality theory stating that output $j$
should not be produced ($Q_j = 0$) unless it incurs no opportunity loss
($T_j = 0$), i.e., it states that $Q_j = 0$ or $T_j = 0$ (or both).
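
The slack-variable reading of the conditions can be illustrated on a small linear program. The sketch below is hypothetical throughout: the coefficients, the assumed optimal values, and the names `primal_slacks` and `dual_slacks` are chosen only to show that the derivatives of the Lagrangian coincide with the slacks and that the products $U_iV_i$ and $T_jQ_j$ vanish.

```python
# Hypothetical linear program (illustrative numbers only):
#   primal: max 3*Q1 + 2*Q2  s.t.  Q1 + Q2 <= 4,  Q1 + 3*Q2 <= 6,  Q >= 0
#   dual:   min 4*V1 + 6*V2  s.t.  V1 + V2 >= 3,  V1 + 3*V2 >= 2,  V >= 0
P = [3.0, 2.0]
C = [4.0, 6.0]
A = [[1.0, 1.0], [1.0, 3.0]]


def primal_slacks(q):
    """U_i = c_i - sum_j a_ij Q_j, which equals dL/dV_i for a linear program."""
    return [C[i] - sum(A[i][j] * q[j] for j in range(2)) for i in range(2)]


def dual_slacks(v):
    """T_j = sum_i a_ij V_i - P_j, which equals -dL/dQ_j for a linear program."""
    return [sum(A[i][j] * v[i] for i in range(2)) - P[j] for j in range(2)]


q_opt = [4.0, 0.0]      # assumed primal optimum (from the corner points)
v_opt = [3.0, 0.0]      # assumed dual optimum (from the corner points)

U = primal_slacks(q_opt)                     # unused capacity of each resource
T = dual_slacks(v_opt)                       # opportunity loss of each output
UV = [u * v for u, v in zip(U, v_opt)]       # condition (6): every product is zero
TQ = [t * q for t, q in zip(T, q_opt)]       # condition (4): every product is zero
print(U, T, UV, TQ)
```

Resource 2 has unused capacity and therefore a zero valuation, while output 2 incurs an opportunity loss and is therefore not produced.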


6. Why It Works 


We are now in a position to offer a heuristic explanation of the Kuhn- 
Tucker theorem. That is, we can now grasp why a set of values of the Q's 
and V's that solve the Lagrangian problem must also be solutions for the 
primal and dual problems, respectively. In particular, we can see why the 
optimal values of the V's, the dual structural variables, can serve as the 
values of the corresponding Lagrange multipliers, i.e., why we have $V_i^\circ = \lambda_i$,
where $V_i^\circ$ is the optimal value of $V_i$.

Note that we are not asserting that the original programming problem 
and the Lagrangian problem are equivalent. Neither problem is simply a 
rearranged version of the other. The analysis states only that these two 
problems happen to yield the same answer. That is, we will have proved that 
we have found the correct Lagrangian if we can show that its solution is also a 
solution to the original primal and dual programming problems. In sum, to
prove that the Lagrangian expression is correct we need only show that
“it works”: that it yields the desired solution to the original problem.

For this purpose we show now how the Kuhn-Tucker conditions for 
the Lagrangian assure that any solution for the Lagrangian problem must 
automatically accord with the constraints and the objective function of the 
original problem. 


A. The original constraints: We have just seen that the Kuhn-Tucker
conditions (3) and (5) (together with the nonnegativity requirements
$Q_j \ge 0$, $V_i \ge 0$) are equivalent to the constraints of the original primal and
dual programs. Hence, since the solution of the Lagrangian problem must
satisfy Kuhn-Tucker conditions (3) and (5), it must satisfy the original
primal and dual constraints as well.


B. The objective function: The objective function of the Lagrangian 
problem is 


$$L(Q, V) = \Pi + \sum_i V_i[c_i - g_i(Q_1, \cdots, Q_n)].$$
But Kuhn-Tucker condition (6) asserts that for any $i$
$$V_i\,\partial L/\partial V_i = V_i[c_i - g_i(Q_1, \cdots, Q_n)] = 0.$$
Hence, in any solution that satisfies the Kuhn-Tucker conditions, all of
the last terms in the preceding expression for $L(Q, V)$ must drop out and
the objective function of the Lagrangian problem becomes simply


(10) $L(Q^\circ, V^\circ) = \Pi,$


which is the value of the objective function of the original (primal) problem.
In exactly the same manner we can show that these values of $Q$ and $V$
must yield


(11) $L(Q^\circ, V^\circ) = \alpha,$


so that the Lagrangian then takes the value of the dual objective function. 
Thus, to summarize, we have shown in intuitive terms that a solution that
satisfies the Kuhn-Tucker conditions for the Lagrangian (and the
nonnegativity conditions for the variables) must satisfy the constraints for the
original problems and must equate the value of the Lagrangian to that of 
the objective function of the primal (the dual). With the original objective 
functions coinciding with the Lagrangian expression and all the original 
constraints satisfied, it need no longer be surprising that the Q and V 
which satisfy the Lagrangian problem as constructed are also the solutions 
to the original primal problem and its dual. 
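
The step leading to (11) can be spelled out for the linear case. The display below is a sketch that assumes the general linear Lagrangian form of Section 2 (unit profits $P_j$, capacities $c_i$, coefficients $a_{ij}$) and simply regroups its terms around the $Q_j$ rather than the $V_i$:

```latex
\begin{align*}
L(Q, V) &= \sum_{j} P_j Q_j + \sum_{i} V_i c_i - \sum_{i}\sum_{j} a_{ij} V_i Q_j \\
        &= \sum_{i} c_i V_i + \sum_{j} Q_j \Big( P_j - \sum_{i} a_{ij} V_i \Big)
         = \sum_{i} c_i V_i + \sum_{j} Q_j \, \frac{\partial L}{\partial Q_j}.
\end{align*}
```

By Kuhn-Tucker condition (4) every term $Q_j\,\partial L/\partial Q_j$ vanishes at the solution, which leaves $L(Q^\circ, V^\circ) = \sum_i c_i V_i^\circ = \alpha$, the value of the dual objective function.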

As a matter of fact a simple argument now shows rigorously that a 
solution of the Lagrangian problem must satisfy the original primal and 
dual, for we have just shown that for $Q = Q^\circ$ and $V = V^\circ$ satisfying the
Kuhn-Tucker conditions for the Lagrangian, these values must satisfy the
primal and dual constraints and so must be feasible. We have, in addition,
by (10) and (11),


$$\alpha = L(Q^\circ, V^\circ) = \Pi.$$


But, it will be recalled, the duality theorems of Chapter 6 tell us that this
is a necessary and sufficient condition for optimality of the solution. For
any pair of feasible solutions to the primal and dual that yield $\alpha = \Pi$ must
also be optimal solutions to those problems. Hence, $Q^\circ$ and $V^\circ$ must be the
optimal solutions for the original primal and dual problems.

We have thus shown that the Lagrangian function as described by Rules 
1 and 2 is constructed correctly—it “works” in the sense that its solution 
is also the solution to the original primal and dual problems. Incidentally, 
this also provides the rationale for our use of the dual variable values $V_i^\circ$
as the Lagrange multipliers for the primal problem and the use of the $Q_j^\circ$
as the Lagrange multipliers for the dual. For the Lagrangian function that
we have just shown to “work” uses the $V_i^\circ$ and the $Q_j^\circ$ in just that way.
Thus this procedure is correct because it yields the correct solution to the 
original problems, which is all we require of it. 
7. Theoretical Applications of the Kuhn-Tucker Conditions


The Kuhn-Tucker conditions can be helpful in the solution of specific 
numerical problems. But economists have used them primarily to deal 
with more general qualitative problems. That is, the conditions can be used 
to derive general conclusions about the nature of the solutions, even where 
programming problems involve rather general functions the values of whose 
parameters are not specified. Thus, while the simplex method can handle 
only a numerical objective function such as $\Pi = 3Q_1 + 5Q_2$, the
Kuhn-Tucker analysis can work with more general objective functions such as
$\Pi = a_1Q_1 + a_2Q_2$, or even $\Pi = f(Q_1, Q_2)$. As a result, the Kuhn-Tucker
conditions may perhaps constitute the most powerful single weapon pro- 
vided to economic theory by mathematical programming. There is no 
known general expression for the solution of a general programming prob- 
lem. Unlike a differential calculus problem, in a programming computation 
we can only find the solution of a specific program all of whose parameters 
are known, determining this solution through a process of successive numeri- 
cal approximations. It is therefore a manifestation of the very great power 
of the Kuhn-Tucker analysis that it does permit us to arrive at general 
qualitative conclusions about the behavior of the solutions to nonnumerical 
problems. A few examples will show precisely how this is accomplished. 


Example 1: Inelastic demands. Evidence, which is perhaps superficial, suggests 
that some of the commodities or services produced by multiproduct “natural” 
monopolies face markets whose demands are quite inelastic, at least throughout 
what the businessman considers the relevant range of prices. It is well known that 
the presence of such items causes problems for the standard theory of profit
maximization. This is so because such a commodity always yields a negative marginal
revenue.⁶ Therefore the firm can increase its profits by raising its price, thus
reducing its sales volume. For this increases its total revenue, and, since the 
company now sells less, it simultaneously reduces its total cost. Let us examine the 
equilibrium of the firm in such circumstances. In the following discussion we 
suppose for simplicity that only two items are sold by the firm. Marginal costs are 
assumed to be positive throughout. 

Let $Q_1$ and $Q_2$ be the quantities sold of the two items,

$P_1(Q_1)$, $P_2(Q_2)$ be their prices,

$MR_1 = \partial P_1Q_1/\partial Q_1$, $MR_2 = \partial P_2Q_2/\partial Q_2$ be their marginal revenues,
$C(Q_1, Q_2)$ be the total cost function, and

$MC_1$, $MC_2$ be the commodities’ respective marginal costs.


6 Proof: Elasticity of demand is defined as $E = -\frac{dQ}{dP}\frac{P}{Q}$, and for marginal revenue we
have the expression
$$MR = \frac{dPQ}{dQ} = P + Q\frac{dP}{dQ} = P\left(1 + \frac{Q}{P}\frac{dP}{dQ}\right) = P\left(1 - \frac{1}{E}\right).$$

Thus, if demand is elastic ($E > 1$), then $MR > 0$, and if demand is inelastic, $MR < 0$.


172 Kuhn-Tucker Methods Chapter 8 


We derive 

Proposition 1: The profit-maximizing firm, one of whose products has
an inelastic demand, will always end up at a corner maximum at which $Q_1$,
the quantity of that item sold, is (virtually) zero or is as small as regulatory
stipulations permit.


Proof: Our firm's problem is to maximize profits,


$$\Pi = P_1Q_1 + P_2Q_2 - C(Q_1, Q_2),$$
subject to the minimum output requirements,


$$0 < m_1 \le Q_1, \qquad 0 < m_2 \le Q_2.$$
Our Lagrangian becomes


$$L(Q, V) = P_1Q_1 + P_2Q_2 - C(Q_1, Q_2) + V_1(Q_1 - m_1) + V_2(Q_2 - m_2).$$


Since there are two structural variables and two Lagrange multipliers, this
problem yields eight Kuhn-Tucker conditions. However, only four of them
explicitly involve $Q_1$, the variable in which we are interested. Thus, the
Kuhn-Tucker conditions that are relevant for our purposes are


(a) $\partial L/\partial Q_1 = MR_1 - MC_1 + V_1 \le 0$

(b) $Q_1\,\partial L/\partial Q_1 = Q_1(MR_1 - MC_1 + V_1) = 0$

(c) $\partial L/\partial V_1 = Q_1 - m_1 \ge 0$

(d) $V_1\,\partial L/\partial V_1 = V_1(Q_1 - m_1) = 0.$


By condition (c), for $m_1 > 0$ we must have $Q_1 > 0$. Therefore, by (b) we
must have $MR_1 - MC_1 + V_1 = 0$. We know (footnote 6) that a commodity
whose demand is inelastic has a negative marginal revenue, i.e., that
$MR_1 < 0$. Thus, since $V_1 = MC_1 - MR_1$ and marginal cost is positive by
assumption, this means $V_1 > 0$. Therefore, by (d) it follows that $Q_1 = m_1$,
that is, $Q_1$ will be set at its minimum permissible level, as was to be shown.
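
A numerical illustration may help. The sketch below is not the author's example: it simplifies to a single product, and the demand curve $P = Q^{-2}$ (constant elasticity of one half), the marginal cost of 0.5, and the minimum output $m_1 = 1$ are all hypothetical choices made so that marginal revenue is negative everywhere.

```python
M1 = 1.0       # hypothetical regulatory minimum output m_1
MC1 = 0.5      # hypothetical constant (positive) marginal cost


def price(q):
    """Hypothetical inelastic demand curve: P = Q^(-2), elasticity 0.5."""
    return q ** -2.0


def marginal_revenue(q):
    """Derivative of revenue P*Q = Q^(-1); negative for every Q > 0."""
    return -(q ** -2.0)


def profit(q):
    return price(q) * q - MC1 * q


# Search permissible outputs m_1, m_1 + 0.1, ..., m_1 + 4.0 for the best one.
candidates = [M1 + 0.1 * k for k in range(41)]
best_q = max(candidates, key=profit)

# From condition (b) with Q_1 > 0: V_1 = MC_1 - MR_1.
v1 = MC1 - marginal_revenue(best_q)

print(best_q, v1)
print(marginal_revenue(best_q) - MC1 + v1)     # condition (a), holding as an equation
print(v1 * (best_q - M1))                      # condition (d)
```

Because profit falls as output rises, the search settles on the smallest permissible output, and the multiplier is strictly positive, as the argument requires.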

Notice the basic procedural trick that recurs again and again in this 
sort of application. We usually focus initially on one of the Kuhn-Tucker 
equations, (4) or (6), rather than the inequalities, (3) or (5). Thus, taking a 
Kuhn-Tucker condition such as $Q_j\,\partial L/\partial Q_j = 0$, we try to verify one of
the two alternatives such a condition permits. If we happen to know
$\partial L/\partial Q_j < 0$, then we deduce $Q_j = 0$. Alternatively, if we know $Q_j > 0$, we
have instead of the inequality, $\partial L/\partial Q_j \le 0$ as specified by condition (3), the
equation $\partial L/\partial Q_j = 0$. This equation can be used to help us to solve for the
value of $Q_j$ and the other variables. The same device, it will be noted, is
also used in deriving the economically more important results that follow. 
Example 2: Peak load pricing. Suppose that the demands of a perfectly
competitive firm vary by hour of day so that in some hours its capacity is fully utilized
(peak periods) while at other times demand is slow so some capacity remains under- 
utilized (off-peak periods). Then we have: 

Proposition 2: The profit-maximizing outputs will be such that prices at 
off-peak periods will merely cover marginal operating costs (raw materials, 
labor, etc.) while in peak periods the prices will exceed marginal operating 
costs. The sum of the excesses of these prices over marginal operating costs 
for all peak periods will just add up to marginal capital cost, i.e., they will 
sum to the marginal cost of increasing capacity. 


Notation: Let
$Q_1, Q_2, \cdots, Q_{24}$ represent quantity demanded during each of the 24 hours of the day,
$P_1, P_2, \cdots, P_{24}$ be the corresponding prices,
$y$ be the hourly output capacity,
$C(Q_1, \cdots, Q_{24})$ be the daily total operating cost, and
$g(y)$ be the daily cost of capital (capacity).
We assume that all $Q_j > 0$, i.e., that some output is sold during each hour
of the day. Total profit per day will obviously be


$$\sum_{j=1}^{24} P_jQ_j - C(Q_1, \cdots, Q_{24}) - g(y),$$


which the firm maximizes subject to the twenty-four capacity constraints:


$$Q_j \le y \qquad (j = 1, \cdots, 24).$$


Our Lagrangian function becomes


$$\Pi_\lambda = \sum_{j=1}^{24} P_jQ_j - C(Q_1, \cdots, Q_{24}) - g(y) + \sum_{j=1}^{24} \lambda_j(y - Q_j).$$


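Since competition is assumed to be perfect, prices are not affected by the
firm's outputs, i.e., $\partial P_j/\partial Q_j = 0$. Our Kuhn-Tucker conditions are then


(a) $\partial \Pi_\lambda/\partial Q_j = P_j - \partial C/\partial Q_j - \lambda_j \le 0 \qquad (j = 1, \cdots, 24)$

(b) $Q_j\,\partial \Pi_\lambda/\partial Q_j = 0 \qquad (j = 1, \cdots, 24)$

(c) $\partial \Pi_\lambda/\partial y = -dg/dy + \sum_j \lambda_j \le 0$

(d) $y\,\partial \Pi_\lambda/\partial y = 0$

(e) $\partial \Pi_\lambda/\partial \lambda_j = y - Q_j \ge 0 \qquad (j = 1, \cdots, 24)$

(f) $\lambda_j\,\partial \Pi_\lambda/\partial \lambda_j = \lambda_j(y - Q_j) = 0 \qquad (j = 1, \cdots, 24).$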


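Since we have assumed all $Q_j > 0$, it follows from (e) that $y > 0$ (that is, if
capacity, $y$, were zero, nothing could be produced).

Now that we know that all $Q_j$ and $y$ are positive, by (b) and (d) we have
$\partial \Pi_\lambda/\partial Q_j = 0$ and $\partial \Pi_\lambda/\partial y = 0$, that is, (a) and (c) become the equations


(a′) $\partial \Pi_\lambda/\partial Q_j = P_j - \partial C/\partial Q_j - \lambda_j = 0$

(c′) $\partial \Pi_\lambda/\partial y = -dg/dy + \sum_j \lambda_j = 0.$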


Furthermore, since in any off-peak period, $t$, there is, by definition, excess
capacity, we must have $y > Q_t$, that is, by (f), for such a period we must
have

$$\lambda_t = 0.$$

Hence by (a′) the first part of our theorem follows at once: For any off-peak
period, $t$,

$$P_t = \partial C/\partial Q_t,$$

that is, for any off-peak period, price will optimally be equal to marginal
operating cost, $\partial C/\partial Q_t$.
For any peak period, $s$, we may, however, have $\lambda_s > 0$ in which case,
by (a′), price will exceed its marginal operating cost by a supplementary
amount equal to $\lambda_s$, i.e.,

$$P_s = \partial C/\partial Q_s + \lambda_s.$$

Moreover, by (c′) the sum of these supplements for all peak periods together
will be exactly equal to the marginal capacity cost, $dg/dy$, that is,

$$\sum_s \lambda_s = \frac{dg}{dy}.$$

This completes the derivation of Proposition 2.
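
The proposition can be illustrated with a three-period case. Everything in the sketch below is an assumption made for the example: the market prices, the quadratic operating cost (marginal operating cost $b + cQ_j$ in each period), the linear capacity cost $g(y) = ky$, and the guess that periods 2 and 3 are the peak periods. The code solves (a′) and (c′) under that guess and then checks the remaining conditions.

```python
B, C_SLOPE, K = 1.0, 1.0, 2.0        # hypothetical cost parameters b, c and k = dg/dy
PRICES = [3.0, 8.0, 7.0]             # hypothetical competitive prices P_j
PEAK = [False, True, True]           # assumed regime: which periods press on capacity


def marginal_operating_cost(q):
    """dC/dQ_j for the assumed cost C = sum over j of (B*Q_j + 0.5*C_SLOPE*Q_j**2)."""
    return B + C_SLOPE * q


def solve(prices, peak):
    """Solve (a') and (c') given the assumed split into peak and off-peak periods."""
    n_peak = sum(peak)
    peak_price_sum = sum(p for p, is_peak in zip(prices, peak) if is_peak)
    # Summing P_s = B + C_SLOPE*y + lambda_s over peak periods, with sum(lambda_s) = K:
    y = (peak_price_sum - n_peak * B - K) / (n_peak * C_SLOPE)
    outputs, lambdas = [], []
    for p, is_peak in zip(prices, peak):
        if is_peak:
            outputs.append(y)                              # capacity fully used
            lambdas.append(p - marginal_operating_cost(y))
        else:
            outputs.append((p - B) / C_SLOPE)              # price = marginal operating cost
            lambdas.append(0.0)
    return y, outputs, lambdas


def conditions_hold(prices, y, outputs, lambdas, tol=1e-9):
    """Check (a'), (c'), (e), (f) and nonnegativity of the multipliers."""
    a = all(abs(p - marginal_operating_cost(q) - lam) <= tol
            for p, q, lam in zip(prices, outputs, lambdas))
    c = abs(-K + sum(lambdas)) <= tol
    e = all(y - q >= -tol for q in outputs)
    f = all(abs(lam * (y - q)) <= tol for q, lam in zip(outputs, lambdas))
    nonneg = all(lam >= -tol for lam in lambdas)
    return a and c and e and f and nonneg


capacity, outputs, lambdas = solve(PRICES, PEAK)
print(capacity, outputs, lambdas)
print(sum(lambdas) == K, conditions_hold(PRICES, capacity, outputs, lambdas))
```

In this sketch the off-peak price just covers marginal operating cost, while the two peak prices exceed it by supplements that add up to the marginal capacity cost. Had the guessed regime been wrong, a negative multiplier or an output above capacity would have caused `conditions_hold` to fail.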

There is a public policy analogue of this theorem.⁷ It states that in
off-peak periods, since there is excess capacity, demand should be encouraged
by charging a price as low as possible without incurring a loss on the
marginal unit sold, i.e., by charging a price that covers only the marginal
operating cost, $\partial C/\partial Q_t$. However, since peak period demand presses on capacity,
any increase in this demand must require additional capital, and it must 
therefore cover its marginal capital cost, dg/dy. These, then, are the basic 
principles of optimal peak-hour and off-peak pricing, i.e., the principles rele- 
vant for the setting of daytime and evening electricity rates, telephone rates, 
etc. It suggests, for example, that there may be something irrational about 
the common practice on trains, toll bridges, etc., under which discount 
tickets are available to commuters who travel, typically, at the peak hours. 
For the commuter ticket tends to encourage utilization of these facilities 
at the most crowded period, an influence whose desirability may well be 
questionable. 


PROBLEMS 


1. Suppose that instead of seeking to maximize profits the firm desires to maximize
its physical sales volume $Q_1 + Q_2$ subject to the rate of return (relative profit)
constraint $\Pi \ge (Q_1 + Q_2)S$, i.e., $P_1Q_1 + P_2Q_2 - C(Q_1, Q_2) \ge (Q_1 + Q_2)S$, where
$S$ is a constant representing the minimum acceptable rate of profit, $\Pi/(Q_1 + Q_2)$.
If $MR_1 < 0$, $MR_2 > 0$, and both marginal costs are equal ($MC_1 = MC_2$),
prove that a maximum requires $\Pi = (Q_1 + Q_2)S$ and $Q_1 = 0$.


2. What are the implications of our marginal revenue assumptions for the elasticities
of demand of the two products? Explain in economic terms, therefore, why no
solution with $Q_1 > 0$ can be expected to be optimal.

3. Suppose the firm produces a single product whose output is $Q$ and that its sales
are affected by its advertising expenditure, $A$. If the firm is trying to maximize
its total revenue, $R(Q, A)$, subject to a profit constraint
$\Pi = R(Q, A) - C(Q) - A \ge m$, where the marginal revenue of advertising and the
marginal cost of output are both positive ($\partial R/\partial A > 0$, $\partial C/\partial Q > 0$), prove,
assuming $Q > 0$ in the solution, that we must have $\Pi = m$, $\partial R/\partial Q > 0$ and
$\partial \Pi/\partial Q < 0$.

4. What is the economic interpretation of the results obtained in the solution to 
Problem 3? (Hint: What do they imply about the relative magnitudes of the 
profit-maximizing output, the unconstrained revenue-maximizing output, and 
the constrained revenue-maximizing output?) 


7 The theorem was derived with the aid of Kuhn-Tucker analysis in S. C. Littlechild,
“Peak-Load Pricing of Telephone Calls,” The Bell Journal of Economics and Management
Science, Autumn 1970, pp. 191-210. The result was previously formulated by Steiner,
Williamson, and others; see P. O. Steiner, “Peak Loads and Efficient Pricing,” Quarterly
Journal of Economics, March 1964, pp. 54, 64-76; and O. E. Williamson, “Peak-Load
Pricing and Optimal Capacity Under Indivisibility Constraints,” American Economic
Review, September 1966, pp. 56, 810-27. Reprinted as “Peak-Load Pricing,” in R. Turvey,
ed., Public Enterprise, Penguin Books, Inc., Baltimore, 1968, pp. 64-85.
5. (Proposition due to Averch and Johnson.) Suppose that a profit-maximizing
firm has a single output and two inputs, capital and labor, whose respective
quantities are Q, x1, and x2 so that its profit function is Π = PQ − c1x1 − c2x2,
where c1 > 0 and c2 > 0 are the respective unit costs of capital and labor. If the
firm's profit is limited to some fixed proportion of its capital so that Π ≤ Kx1,
K > 0, and x1 > 0, x2 > 0, prove that (∂Q/∂x1)/(∂Q/∂x2) cannot equal c1/c2 as
would be the case in the absence of the constraint, i.e., the regulatory constraint
will distort the relative proportions of capital and labor used by the firm.
Assume 0 < λ < 1, where λ is the Lagrange multiplier. (This can be proved, on the
assumption K > 0 and the premise that the constraint effectively reduces the
firm's total profits.)

6. Discuss the significance of the Averch-Johnson result (Problem 5) for regulatory 
policy relating to public utilities. 
1. Demand Curves


The demand curve is among those devices of economic theory which 
have found frequent employment in applied economics. In its traditional 
form, it sums up the response of consumer demand to alternative prices 
of a product—it can tell management what may be expected to happen to 
the demand for one of its outputs if the price of that item is changed. 

This information is summarized in a graph (the demand curve itself) 
which shows how much will be demanded at every possible (hypothetical) 
price over the relevant range (Figure 1). For example, point D0 on the
demand curve DD’ indicates that at price OP0 the consumer, or group of
consumers, for whom the curve is drawn will wish to purchase OX0 units
of the product.

Several features of the demand curve should be noted: 


1. It is customary to represent the price level on the vertical axis and 
the quantity demanded on the horizontal axis.¹


1 This arbitrary convention would seem to be an inappropriate arrangement because 
in the present discussion we treat quantity demanded as the dependent variable and 
price as the independent variable. The origin of this practice is that this curve together 
with a supply curve has traditionally been used in the analysis of price determination 
in a competitive industry as described in Chapter 16, Section 3. However, even here the 
price cannot be considered a dependent variable since, in the supply-demand analysis, 
price and quantity are determined simultaneously. 
2. The graph may refer to the demand of an individual consumer, or it
may describe the aggregated demands of a group of consumers constituting
a market. The market demand curve is obtained from the demand curves
of the individuals composing it by adding up, for each price in turn, the
quantities all the individual consumers demand at that price. That is, one
adds the individual demand curves horizontally to obtain the market
demand curve.

3. The demand curve assumes that there is no change in the values of 
other pertinent variables. Specifically, prices of other goods and consumer 
incomes are among the other things that are assumed to remain equal. 

4. The graph depicts the situation at a single point in time, say 4:33 p.m.
on June 12. Hence, all but one of the prices and quantities must be hypo- 
thetical—the curve must generally answer the “iffy” question: “If price 
were OP, how much would this (these) consumer(s) buy?” 

5. The curve is generally assumed to have a negative slope. In economic
terms, this is the plausible assertion that, other things being equal, more
of the commodity would be demanded (OX1 rather than OX0) if the price
were lower (OP1 instead of OP0). However, two exceptions should be
mentioned: cases of snob appeal and cases where consumers judge quality by
price. Commodities like expensive jewelry may be purchased precisely
because their price is high, and a fall in their price might reduce their snob
appeal and therefore, perhaps, their sale (although enough poorer
consumers might then be induced to buy the items to make up for the loss of
affluent customers). Similarly, when consumers have no ability to judge
the quality of a good directly and use price as an indicator of quality, as
they probably often do, a reduction in its price may cut into the demand
for a good. The negative slope of the usual demand curve will be discussed
again later in this chapter, along with another exceptional case that has
been discussed widely in the theoretical literature.
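
The horizontal addition of individual demand curves described in item 2 above can be sketched in a few lines of Python. This is only an illustration: the two linear individual demand curves and their coefficients are hypothetical and are not taken from the text.

```python
def individual_demand(intercept, slope):
    """Return a hypothetical linear demand curve x = max(0, intercept - slope * price)."""
    def quantity(price):
        # A consumer never demands a negative quantity.
        return max(0.0, intercept - slope * price)
    return quantity


def market_demand(individual_curves):
    """Add the individual curves horizontally: sum the quantities at each price."""
    def quantity(price):
        return sum(curve(price) for curve in individual_curves)
    return quantity


# Two hypothetical consumers (the coefficients are assumptions for illustration).
consumers = [individual_demand(10, 1), individual_demand(6, 2)]
market = market_demand(consumers)

for price in (0, 2, 4, 8):
    print(price, market(price))
```

At prices above 3 the second consumer drops out of the market, so the market curve has a kink there even though each individual curve is a straight line.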
It is important to recognize the isotemporal character of the demand
curve—the fact that every one of its points represents one of the hypo- 
thetical possibilities available for some given moment. This property is a 
common feature of all the relationships used in static analysis, i.e., of 
virtually all the relationships in this book. This characteristic is troublesome 
both for empirical application and for comprehensibility of economic 
analysis by laymen. Yet, this attribute of our static relationships is not 
imposed as a mere caprice or as a means to introduce analytical complexity 
or elegance. 

Rather, we use relationships that are isotemporal because most of our 
analysis concerns itself one way or another with optimal choice. Behavioral 
analysis in economic theory usually proceeds on the assumption that the 
decision-maker always arrives at decisions that are optimal in terms of his 
objectives; welfare analysis seeks to determine what decisions are optimal 
from the point of view of the public interest; etc. But we saw in Chapter 1 
that, by definition, optimization analysis consists of the explicit or implicit 
comparison of the financial consequences of the choices available to the 
decision-maker. The need for isotemporal relationships follows at once. 
Suppose a seller is considering the selection of a price for 4:33 p.m. on 
June 12 and the price possibilities under consideration are $12.99, $14.99, 
and $16.99. Obviously, the relevant consideration for the decision-maker is 
how much he will sell if he selects the first of these possibilities, how much 
he will sell instead if he selects the intermediate-price candidate, etc. But 
that is precisely the information that an isotemporal demand curve gives:
it tells us how much would be sold if some one or another of the possible
prices is the one that is selected for the moment to which the decision pertains.
In sum, since optimization requires explicit comparison of the possibilities 
available at the time to which the decision applies, the relevant relation- 
ships (curves) must describe (and contrast) just those hypothetical 
possibilities for that time period. 


2. Shifting Demand Curves: Demand Functions 


Since the demand curve is defined to pertain only to a particular time, 
its shape and position are likely to change with the passage of time. At one
moment DD’ is the relevant demand curve, but at another instant the
curve has the shape EE’. Such a change is described as a shift in the
demand curve. This is contrasted with a movement along a demand curve,
say from point D0 to D1.

A shift in a demand curve is normally accounted for by a change in the 
value of some of the variables which affect demand. For example, a rise in
consumer income can lead to an upward shift in the demand curve from
DD’ to EE'. This means that at any given price, such as OP, the con- 
sumer(s) will demand more than before the shift. It should be noted, 
however, that if price happens to rise sufficiently at the same time, say to 
OP, consumers may end up buying less despite an outward shift in the 
demand curve. In such a case, the shift in demand is accompanied by an 
offsetting movement along the curve. 

Besides income, many other variables can affect the position of the 
curve. A change in the amount of advertising, a change in price or quality,
or the advertising approach of a competing product—even a change in the 
weather—can shift a demand curve. Some of the relevant variables may 
even be intangible and unquantifiable—for example, a change in consumer 
tastes can cause a shift in a demand curve—although we may prefer to go 
behind this phenomenon and seek the variables which account for the 
taste change. 

To summarize, demand is a function of many variables such as price,
advertising, and decisions relating to competing and complementary 
products. The relationship which describes this entire many-variable inter- 
connection is called the demand function. By contrast, the demand curve 
deals only with two of these variables, price and quantity demanded, and 
ignores the others, or, rather, assumes that their values are held constant. 
Indeed, the distinction between a movement along and a shift in a demand 
curve may be described in terms of the variables involved. Any change in 
quantity demanded which results only from a variation in price is a move- 
ment along the curve, whereas change in the value of any other variable in 
the demand function is likely to shift the demand curve. 
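
The distinction between the demand function and a demand curve can be made concrete with a small sketch. The linear form and every coefficient below are assumptions chosen purely for illustration; the text does not specify any functional form.

```python
def demand_function(price, income, advertising):
    """Hypothetical many-variable demand function (all coefficients are assumed)."""
    return max(0.0, 50.0 - 4.0 * price + 0.01 * income + 0.5 * advertising)


def demand_curve(income, advertising):
    """Hold the other variables constant and return quantity as a function of price alone."""
    return lambda price: demand_function(price, income, advertising)


dd = demand_curve(income=1000, advertising=10)   # plays the role of curve DD'
ee = demand_curve(income=1500, advertising=10)   # plays the role of curve EE' after income rises

movement_along = dd(4.0) - dd(5.0)   # only price changes: a movement along DD'
shift = ee(5.0) - dd(5.0)            # only income changes: a shift from DD' to EE'

print(movement_along, shift)
# A sufficiently large simultaneous price rise can offset the shift:
print(ee(7.0) < dd(5.0))
```

Fixing income and advertising turns the many-variable function into a two-variable curve, which is why a change in either of them produces a new curve rather than a new point on the old one.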

Several concluding observations about the distinction between shifts in 
and movements along the demand curve are relevant: 


1. Phrases such as “a rise in demand" are ambiguous and should be 
avoided, since it is not clear whether they refer to a shift in or a movement 
along the curve. 

2. The distinction is often important for applied economics. For 
example, the statement that a reduction in demand is deflationary is valid
only if it refers to a downward shift in the demand curve, since a leftward 
movement (a decrease in quantity demanded) along a negatively sloping 
demand curve must, by definition, be concurrent with a rise in price. One 
can find cases where this has been misunderstood by legislators who there- 
upon have made nonsensical statements on inflation policy. Similarly, the 
sort of increase in demand which is most eagerly hoped for in a business 
firm will involve a shift in the demand curves for its products. In fact, it 
will normally result from autonomous changes in the values of the variables 
which are entirely outside the management's control. There is usually 
some cost to the firm when the quantity of its products demanded increases
as a result of a change in the firm’s advertising expenditure, or in the 
incidental services which it provides to its customers, or in some other of its 
demand curve-raising activities. But an upward shift in demand which 
occurs because of a rise in national income or favorable weather comes to 
the company free. 

3. The possibility that demand curves can frequently shift implies 
that a statistical investigation of the shape of such a curve requires the 
aid of relatively powerful methods. There is a serious difficulty in the 
obvious approach, which involves our taking price-quantity data for a
number of months and plotting them on a graph. For example, if in October
OX0 units were sold at (average) price OP0 (point D0 in Figure 1) whereas
November and December sales were represented by points E0 and F0,
respectively, this method would have us draw in the statistical “demand
curve” SS', which, as we can see, really resembles none of the true demand
curves (the DD’ curve for October, the EE' curve for November, and the
FF' December curve). The difficulty is that the true demand curve has
shifted over this period. The naive statistical method which has just been 
described does not even indicate this fact and it certainly offers us no means 
of correcting for it. 

More sophisticated methods have been designed to deal with this so-called
identification problem, and, more generally, with the econometrics
of simultaneous equation estimation. However, it is desirable to defer
discussion of these techniques until later.²
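
The difficulty with the naive statistical method can be illustrated numerically. In the sketch below, the three monthly demand curves, their common slope, and the observed prices are all hypothetical numbers invented for the illustration; the least-squares line simply stands in for the curve SS' drawn through the observed points.

```python
def least_squares_slope(prices, quantities):
    """Slope of the ordinary least-squares line of quantity on price."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_x = sum(quantities) / n
    covariance = sum((p - mean_p) * (x - mean_x) for p, x in zip(prices, quantities))
    variance = sum((p - mean_p) ** 2 for p in prices)
    return covariance / variance


# Assumed true demand curves: x = intercept - 2p, with the intercept shifting each month.
TRUE_SLOPE = -2.0
intercepts = {"October": 20.0, "November": 30.0, "December": 40.0}
observed_prices = {"October": 4.0, "November": 6.0, "December": 8.0}

# One observed point per month, each lying on that month's own demand curve.
observed_quantities = {
    month: intercepts[month] + TRUE_SLOPE * observed_prices[month]
    for month in intercepts
}

naive_slope = least_squares_slope(
    list(observed_prices.values()), list(observed_quantities.values())
)
print("true dx/dp:", TRUE_SLOPE, "naive estimate:", naive_slope)
```

With these assumed figures the fitted line slopes upward although every true curve slopes downward, which is the sense in which the statistical "curve" resembles none of the true demand curves.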


3. Elasticity: A Measure of Responsiveness 


The most obvious piece of information we desire of a demand function 
(or from economic relationships of other varieties) is an indication of the 
effect on the “dependent” variable of a change in the value of one of the 
other variables. In the case of the demand curve, this involves measure- 
ment of the response in quantity demanded which can be expected to 
result from a given change in the price of the commodity. 

The obvious measure of responsiveness is, of course, what we may call
the marginal demand contribution of a price change, Δx/Δp, or the
corresponding derivative, dx/dp, the change (fall) in quantity demanded caused
by a unit change (rise) in price. It will be observed, incidentally, that
this measure is the reciprocal of the slope of the demand curve Δp/Δx
(or dp/dx), so that the flatter the demand curve, the greater will be the
value of this measure of responsiveness to price change. This peculiarity
results from the oddity in the conventional drawing of the demand curve


2 See the next chapter and its appendix.
which has already been noted: the fact that the value of the apparently
dependent variable, quantity, is measured along the horizontal axis, and 
that of the independent variable, price, along the vertical axis. We might 
well get a better intuitive grasp of the degree of price responsiveness of a 
demand curve if the diagram were turned on its side. 

In any event, the obvious measures of responsiveness, Δx/Δp and
dx/dp, are subject to a drawback which has led theorists to employ
instead another measure, elasticity. The difficulty with, say, Δx/Δp is
that it deals with the absolute changes in quantity and price, which makes
it difficult to compare the responsiveness of different commodities.
Commodities are measured in different units: labor in hours, land in acres,
and whiskey in fifths or quarts. There is no simple way of comparing a 
20,000-quart increase in the quantity of Scotch demanded with a 3,000- 
acre rise in the demand for land. In economics it is difficult, because of the 
very nature of the animal, to impose uniform units on all the relevant 
magnitudes as is done in physics. 

But the problem extends beyond the dissimilarity of units, because, 
even in the measurement of price change, the magnitudes are not readily 
comparable. Consider a 1-cent fall in the price of a package of bubble 
gum and an equal fall in the price of an automatic dishwasher. We might 
not be surprised to find bubble gum sales booming when habitues discover 
the bargain in this brand of the confection, but it is difficult to believe 
that a one-penny reduction in dishwasher prices would even be noticed. 
Though the measure Δx/Δp would therefore almost certainly yield a much
higher number in the case of chewing gum than in that of major household 
appliances (a much greater change in quantity demanded per penny price 
reduction), we would surely hesitate to conclude from this that the demand 
of the former was significantly more price-sensitive. 

Theorists have concluded, from such considerations, that an appropriate 
measure of responsiveness of demand to price changes should employ 
percentage rather than absolute change figures. A 1 per cent (rather than a
one-penny) fall in price then becomes the standard of comparison, so that 
the change in dishwasher price in our illustration is discounted as an 
insignificant price fall in comparison with that of the bubble gum. 

Employing these percentage terms we have the definition 


$$
\text{price elasticity of demand for item } X
= -\,\frac{\text{percentage change in quantity of } X \text{ demanded}}
          {\text{percentage change in the price of } X}
$$


The only peculiarity in the definition which remains to be explained is the 
presence of the minus sign before the fraction. This is inserted to make 
the elasticity number nonnegative. When the demand curve is negatively 
inclined, a rise in price (Δp positive) will lead to a fall in quantity (Δx
negative) so that in our elasticity fraction the numerator and denominator
will be of opposite sign. Therefore the fraction will be a negative number, 
and a minus sign is needed to make the number positive. The insertion of 
this sign in the elasticity formula is, then, just a matter of linguistic 
convenience. 

For our purposes it is necessary to define the elasticity measure somewhat
more specifically. The percentage change in any quantity, x, is
defined as 100 times the change in x, i.e., as 100Δx, divided by x. For
example, if quantity rises from 10 to 15, we have Δx = 15 − 10 = 5 and
the percentage rise in x is 100Δx/x = 500/10 = 50 per cent. Similarly,
the percentage change in p is given by the expression 100Δp/p. Therefore
we have, by our definition of elasticity,

$$
\text{price elasticity of demand}
= -\frac{100\,\Delta x/x}{100\,\Delta p/p}
= -\frac{\Delta x/x}{\Delta p/p}
$$

(since we can divide both numerator and denominator by 100).
Moreover, since division by a fraction, Δp/p, is the same as multiplication
by its reciprocal, p/Δp, we obtain the expression

$$
\text{price elasticity of demand}
= -\frac{\Delta x}{x}\cdot\frac{p}{\Delta p}
= -\frac{\Delta x}{\Delta p}\cdot\frac{p}{x}
\tag{1}
$$

This expression, which will be used throughout the remainder of the 
elasticity discussion, helps now to describe two different elasticity con- 
cepts: point elasticity and arc elasticity. Arc elasticity is a measure of the 
average responsiveness to price change exhibited by a demand curve over 
some finite stretch of the curve such as D0D1 in Figure 1. One complication
is inherent in the concept. In the elasticity formula (1), when price changes
from P0 to P1 it is clear that Δx = X1 − X0, the change in quantity
bought (Figure 1), and that Δp = P1 − P0. But what are the values of
x and p? Since a range of values of x occurs along arc D0D1, no unique
value of this variable is called for by the definition. It is customary for this
purpose to use the average of the two end values of x, that is, to set
x = (X1 + X0)/2, and to do the same for the percentage change in price.
Hence the arc elasticity of demand is defined by the expression

$$
-\frac{\Delta x}{\Delta p}\cdot\frac{p}{x}
= -\frac{X_1 - X_0}{P_1 - P_0}\cdot\frac{(P_1 + P_0)/2}{(X_1 + X_0)/2}
$$

so that, multiplying both numerator and denominator by 2, we have

$$
\text{arc (price) elasticity of demand}
= -\frac{X_1 - X_0}{P_1 - P_0}\cdot\frac{P_1 + P_0}{X_1 + X_0}.
\tag{2}
$$
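
Formula (2) translates directly into a short function. The sketch below assumes nothing beyond the formula itself; the two price-quantity pairs used to exercise it are hypothetical.

```python
def arc_elasticity(p0, x0, p1, x1):
    """Arc price elasticity of demand between (p0, x0) and (p1, x1), as in formula (2)."""
    return -((x1 - x0) / (p1 - p0)) * ((p1 + p0) / (x1 + x0))


# Hypothetical end points of an arc: price falls from 10 to 8, quantity rises from 100 to 120.
e = arc_elasticity(p0=10.0, x0=100.0, p1=8.0, x1=120.0)

# Because averages of the end values are used, the measure does not depend on
# which end of the arc is treated as the starting point.
e_reversed = arc_elasticity(p0=8.0, x0=120.0, p1=10.0, x1=100.0)

print(e, e_reversed)
```

Using the averages of the end values is what makes the measure the same in both directions; had the initial values alone been used for p and x, the two calculations would generally disagree.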
Point elasticity of demand is the corresponding concept for each
particular point on the demand curve. But, at any such point there is no
change in price (Δp = 0) or in quantity. We therefore define point
elasticity in much the same way as the derivative concept in Chapter 4.
That is, we take point elasticity to be the limit of the arc elasticity figure as
the arc D0D1 is made smaller and smaller, first being cut down to D0D2,
then to D0D3, etc. We thereby arrive at the definition

$$
\text{point price elasticity of demand} = -\frac{dx}{dp}\cdot\frac{p}{x},
\tag{3}
$$

where the derivative dx/dp has been substituted for Δx/Δp in the elasticity
definition (1).
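
The limiting process behind definition (3) can be watched numerically. The linear demand curve below, x = 100 − 5p, is a hypothetical example chosen only because its derivative is known exactly; the arc is shrunk step by step from a fixed starting price.

```python
def demand(p):
    """Hypothetical linear demand curve x = 100 - 5p (an assumption for illustration)."""
    return 100.0 - 5.0 * p


def arc_elasticity(p0, p1):
    """Arc elasticity along the curve between prices p0 and p1, as in formula (2)."""
    x0, x1 = demand(p0), demand(p1)
    return -((x1 - x0) / (p1 - p0)) * ((p1 + p0) / (x1 + x0))


def point_elasticity(p, dx_dp=-5.0):
    """Point elasticity from definition (3), using the known derivative dx/dp = -5."""
    return -dx_dp * p / demand(p)


p0 = 10.0
# Shrink the arc: each successive end point lies closer to the starting point.
arcs = [(h, arc_elasticity(p0, p0 + h)) for h in (4.0, 1.0, 0.1, 0.001)]
for h, value in arcs:
    print(h, value)
print("point elasticity:", point_elasticity(p0))
```

As the arc shortens, the arc figures settle toward the point elasticity at the starting price, which is the sense in which (3) is the limit of (2).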

Before leaving the question of definitions, it is well to point out that 
the elasticity concept can be (and has been) adapted to measure responsive- 
ness in variables other than quantity and price. For example, we may 
measure the responsiveness of the supply, s, of some commodity to a
change in interest rate, i, as

$$
\text{interest elasticity of supply}
= \frac{\text{percentage change in supply}}{\text{percentage change in interest rate}}
$$

Similarly, when the price, p_j, of one commodity j affects the quantity
demanded, x_k, of another commodity k it is customary to define

$$
\text{cross elasticity of demand} = \frac{dx_k}{dp_j}\cdot\frac{p_j}{x_k}
$$

The reader should try defining such concepts as the income elasticity 
of imports and the interest elasticity of investment. 


4. Properties of the Elasticity Measure 


The basic elasticity formula (1) permits us to see a relationship between
an elasticity measure and the corresponding marginal measure of
responsiveness, Δx/Δp, with which the elasticity discussion began. Elasticity is
simply the marginal measure multiplied by the fraction −p/x.

This observation, in turn, helps us to see one of the peculiarities of the 
elasticity measure. Consider a straight-line demand curve like that in 
Figure 2. It is tempting to guess that the elasticity of such a demand
curve is the same throughout the length of the curve. Such constancy does
hold for the marginal measure of price responsiveness, Δx/Δp, since it is
the reciprocal of the slope of the demand curve which does not change
along a straight line. There are two cases in which the elasticity measure
also behaves in this way: If a demand curve is vertical (a fixed quantity
demanded no matter what the price, so that Δx/Δp = 0), its elasticity is
zero throughout, and at the other extreme, a horizontal demand curve
has “infinite elasticity.” But in any other straight-line case, such as DD'
in Figure 2, elasticity is not constant. Indeed, it varies continuously from
zero at point D' on the horizontal axis to any number as high as we like
when we get close to the vertical axis (so that elasticity is said to approach
infinity as we move toward point D)!

The reason for the variability in the elasticity of a straight line is
readily seen from our last elasticity formula. We have just noted that the
first fraction in this expression, Δx/Δp, retains the same value throughout
the graph. But that is not true of the second fraction, p/x. At point D’,
we have x = OD' and p = 0 so that p/x = 0, and hence the price elasticity
of demand is zero also. As we move toward the left along the demand
curve, the numerator of p/x increases while the denominator, x, approaches
zero. Hence the value of the fraction grows larger and larger without
limit and the same is consequently true of the price elasticity of demand.³
We conclude that, except in the zero elastic vertical case and the infinite
elastic horizontal case, elasticity of demand is certainly not constant along
a straight-line demand curve. This complication is a price which we pay for
using percentage figures instead of absolute figures in the elasticity measure.
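
A short tabulation shows how the elasticity changes along a straight-line demand curve even though Δx/Δp does not. The curve x = 100 − 5p is hypothetical; returning `None` at the vertical-axis intercept is simply this sketch's way of marking the point where the measure is undefined.

```python
def linear_demand(p, intercept=100.0, slope=5.0):
    """Hypothetical straight-line demand curve x = intercept - slope * p."""
    return intercept - slope * p


def point_elasticity(p, intercept=100.0, slope=5.0):
    """Point elasticity -(dx/dp)(p/x); None where x = 0 and the ratio p/x is undefined."""
    x = linear_demand(p, intercept, slope)
    if x == 0:
        return None  # division by zero: elasticity is not defined at this point
    return slope * p / x


# Move from the horizontal-axis intercept (p = 0) toward the vertical axis (p = 20).
for price in (0.0, 5.0, 10.0, 15.0, 19.0, 20.0):
    print(price, point_elasticity(price))
```

The marginal measure is 5 units per dollar at every price, yet the elasticity runs from zero through unity at the midpoint and rises without limit as the quantity approaches zero.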


3 Note that I have avoided speaking of the elasticity being infinite at point D, where
x = 0. Here the elasticity is not even defined because an attempt to evaluate the
fraction p/x at that point forces us to commit the sin of dividing by zero. The reader who
has forgotten why division by zero is immoral may recall that division is the reverse
operation of multiplication. Hence, in seeking the quotient c = a/b we look for a
number, c, which when multiplied by b gives us the number a, i.e., for which cb = a. But
if a is not zero, say a = 5, and b is zero, there is no such number, because there is
no c for which c × 0 = 5.
However, even here there is an important compensation. Though the
connection may at first not be obvious, the following theorem will lead us
to a type of curve whose elasticity is constant throughout and which will
offer us some useful insights.


Elasticity Proposition 1: Given any segment of the demand curve, a
change in price within that segment will have no effect on the product px
(price multiplied by quantity demanded) if and only if the elasticity of
demand throughout the range is exactly equal to unity. More specifically, a
change in price from p0 to p1 will yield p0x0 = p1x1 if and only if the
elasticity of the arc D0D1 is unity, and each and every intermediate price
change will also leave px unaffected if and only if point elasticity is unity
at every point along this arc.⁴


The product px represents the amount which the consumer would spend 
and which the seller would therefore receive if quantity x were bought at
price p. This theorem therefore states that if the price elasticity of his 
demand is unity, a fall in price will induce the consumer to increase his 
purchases by exactly the amount needed to keep his total outlay the same 
as it was initially. This is certainly plausible intuitively, for we may view a 
price reduction as having an expenditure-increasing effect (more demanded) 
and an expenditure-reducing effect (a lower price paid for each unit 
purchased). When the elasticity of demand is unity, the percentage fall in 
price is, by definition, exactly equal to the percentage rise in quantity 
demanded, and it is therefore believable that these two effects will then 
exactly offset one another, as the theorem asserts. 

The theorem describes, implicitly, the type of demand curve along 
which elasticity of demand is constant. Specifically, it tells us that the 
elasticity will take the constant value unity along any curve characterized 
by the equation px = K (any constant). Such a curve is called a rectangular
hyperbola and has the shape of one of the curves depicted in Figure 3 
(where different curves correspond to different values of K). That a 


4 The following argument demonstrates both Propositions 1 and 2 in terms of point
elasticity. The arc elasticity proofs are just matters of tedious algebraic manipulation.

Proof (of Propositions 1 and 2): We want to determine the relationship between the
elasticity, E, and the effect of a change in price, p, on total expenditure, px = p·f(p),
where x = f(p) is the equation of the demand curve. Then the effect of a change of
price on total expenditure is (by the formula for the derivative of a product)

$$
\frac{d[p\,f(p)]}{dp}
= f(p) + p\,\frac{df}{dp}
= x + p\,\frac{dx}{dp}
= x\left(1 + \frac{p}{x}\,\frac{dx}{dp}\right)
= x(1 - E).
$$

Hence, a change in price will leave xp constant, d[p·f(p)]/dp = 0, if and only if E = 1.
Similarly, d[p·f(p)]/dp > 0 if and only if E < 1 (demand inelastic), etc.
Figure 3


curve px = K is of such a shape can be seen by noting that if our demand
curve is DD’, then, e.g., at price OP0, consumer expenditure OX0 × OP0
is represented by the area of the shaded rectangle OX0D0P0 (= height
OP0 multiplied by width OX0). Similarly, at price OP1 expenditure is
depicted by area OX1D1P1. Since expenditure is constant along a demand
curve of unit elasticity, it follows that all such rectangles must be equal in
area. Hence the unit elastic demand curve must approach the axes of the
diagram asymptotically, for as such a rectangle gets taller it must become
narrower in order for its area to remain equal to that of its fellow
expenditure rectangles. Moreover, such a demand curve must not touch either axis
for at a point of intersection with an axis either p or x is zero so that px
must equal zero rather than K. It can be shown, incidentally, that demand
curves of constant elasticity 2 or any other number are asymmetrical
relative to the axes but roughly similar in shape to rectangular hyperbolas.
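
Proposition 1 and the remark on other constant-elasticity curves can be checked numerically. The sketch assumes the standard constant-elasticity form x = K·p^(−E), which reduces to the rectangular hyperbola px = K when E = 1; the value K = 120 and the sample prices are arbitrary illustrative choices.

```python
def constant_elasticity_demand(p, k=120.0, elasticity=1.0):
    """Assumed constant-elasticity curve x = k * p**(-elasticity); px = k when elasticity is 1."""
    return k * p ** (-elasticity)


def arc_elasticity(p0, x0, p1, x1):
    """Arc elasticity as in formula (2)."""
    return -((x1 - x0) / (p1 - p0)) * ((p1 + p0) / (x1 + x0))


def numerical_point_elasticity(p, elasticity, h=1e-6):
    """Point elasticity -(dx/dp)(p/x), with dx/dp approximated by a central difference."""
    x = constant_elasticity_demand(p, elasticity=elasticity)
    dx_dp = (
        constant_elasticity_demand(p + h, elasticity=elasticity)
        - constant_elasticity_demand(p - h, elasticity=elasticity)
    ) / (2.0 * h)
    return -dx_dp * p / x


# Along px = K every expenditure rectangle has the same area.
outlays = [p * constant_elasticity_demand(p) for p in (2.0, 3.0, 4.0, 6.0)]
unit_arc = arc_elasticity(
    2.0, constant_elasticity_demand(2.0), 6.0, constant_elasticity_demand(6.0)
)
print(outlays, unit_arc)

# A curve with elasticity 2 has that elasticity at every price tried.
print([numerical_point_elasticity(p, elasticity=2.0) for p in (1.0, 3.0, 9.0)])
```

The same check with total expenditure equal at the two end points is the substance of Problem 1 at the end of this section.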


We may use a geometric argument to extend our first elasticity theorem 
as follows: 


Elasticity Proposition 2: If a demand curve has elasticity less than unity 
(it is inelastic), a rise in price will increase consumer expenditure, px, and
vice versa. If the curve has an elasticity greater than unity (it is elastic), a 
fall in price will increase consumer expenditure and vice versa. 

A proof has already been provided in a footnote. 
theorems just given lie behind much of the use of the 
applied economics. They are met, for ex 
of taxation, international trade, 
simple illustration, note that it will not ordinaril 


it will also 
As a second 
e." In such a 


190 Demand Curves, Utility Surfaces, and Indifference Maps Chapter 9 


case, popular writers often recommend that the country devalue its cur- 
rency, thus making its products cheaper and hence leading foreigners to 
import more. There are a number of complications to be considered, but
the one which is relevant for our purposes is the possibility that the 
elasticity of the foreigners' demand for that country's exports may be less 
than unity. Thus the country may find, after devaluing, that though it is 
shipping more abroad, it is actually earning less gold than before! 
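
The devaluation example is an application of Proposition 2, and a rough numerical sketch makes the point. The constant-elasticity export demand curve, the scale constant, and the 20 per cent price cut are all assumptions made for the illustration, and earnings are measured in the foreign buyers' currency.

```python
def export_demand(price, elasticity, k=1000.0):
    """Assumed constant-elasticity foreign demand x = k * price**(-elasticity)."""
    return k * price ** (-elasticity)


def export_earnings(price, elasticity):
    """Foreign expenditure px on the country's exports at the given price."""
    return price * export_demand(price, elasticity)


old_price, new_price = 1.0, 0.8   # devaluation makes exports 20 per cent cheaper abroad

for elasticity in (0.5, 1.0, 2.0):
    more_shipped = export_demand(new_price, elasticity) > export_demand(old_price, elasticity)
    change = export_earnings(new_price, elasticity) - export_earnings(old_price, elasticity)
    print(elasticity, more_shipped, round(change, 2))
```

In each case more is shipped abroad after the price cut, but earnings rise only when the assumed elasticity exceeds unity; with inelastic demand the larger volume brings in less than before.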


PROBLEMS 


1. Given the definition of arc elasticity of demand as shown in Equation (2), 
prove that if total expenditure is constant along an arc, so that P1X1 = P0X0,
then the arc elasticity must be unity. 

2. If a firm’s price elasticity of demand is greater than 1, can you say from this 
alone whether a fall in its price is profitable? Why? 


5. Utility Analysis of Demand 


Economic theory has long sought to go behind the obvious and 
observable demand phenomena which are summed up in the demand 
function in an attempt to explain these observations in terms of the struc- 
ture of consumer desires. It seemed immediately apparent that there is some 
connection between demand and the utility of the commodity, i.e., the 
subjective benefit which the consumer obtains from its possession. But to 
classical economists this connection appeared to be limited largely to the 
fact that items totally without utility would not be demanded at all. To 
show that there is little or no connection with price, they called attention 
to the fact that water, which is essential to life and therefore to be con- 
sidered of very great utility, commands only a very low and often no more 
than a zero price, whereas diamonds, whose utility was said to be less than 
that of water, are notoriously expensive. 

This “diamond-water” paradox was explained by an analysis which
was the focal point of the economic literature at the turn of the century. 
It was argued that the price of a commodity was determined not by its 
total but by its marginal utility. For this discussion it is convenient to 
evaluate the marginal utility of a commodity, X, in money terms (the 
amount of money the consumer is just willing to give up for another unit).
The connection between price and marginal utility is that if to some
rational consumer the marginal utility of some item, X, when he holds A
units of X is more than its price, he can increase his welfare by purchasing
some more units of X. This is so because, by definition, in these
circumstances he receives more value than he gives up in such an exchange.
Similarly, if the marginal utility of an Lth unit of the commodity is less
than its price, the consumer can benefit by buying less than L units. He
should, therefore, always buy such an amount of X that its marginal utility
is equal to its price.⁵
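
The rule that purchases are carried to the point where marginal utility equals price can be verified on a simple example. The utility function u(x) = 20√x (measured in dollars) and the price of 2 are hypothetical; the search over a grid of quantities is just a crude stand-in for the consumer's comparison of alternatives.

```python
import math


def total_utility(x):
    """Hypothetical total utility in dollars, u(x) = 20 * sqrt(x)."""
    return 20.0 * math.sqrt(x)


def marginal_utility(x):
    """Derivative of the assumed utility function: du/dx = 10 / sqrt(x)."""
    return 10.0 / math.sqrt(x)


def best_purchase(price, max_hundredths=10000):
    """Search quantities 0.01, 0.02, ..., 100.00 for the largest net benefit u(x) - price * x."""
    candidates = [i / 100 for i in range(1, max_hundredths + 1)]
    return max(candidates, key=lambda x: total_utility(x) - price * x)


price = 2.0
x_star = best_purchase(price)
print(x_star, marginal_utility(x_star), total_utility(x_star) - price * x_star)
```

Because the assumed marginal utility falls steadily as x rises, the quantity at which it equals price is the only candidate and is a true maximum; the complications that arise when marginal utility is not everywhere diminishing are taken up later in this section.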

The marginal utility theorists carried their analysis considerably 
further. For one thing, they argued, largely on introspective grounds, the 
more we possess of a commodity, the less we value an additional unit—the 
famous “law” of diminishing marginal utility. Partly, it was stated, this 
is so because we give priority to more highly valued uses—if we have only 
one piece of cake, we feed it to our child; if we have two, we divide it 
between husband and wife and a third we give to our mothers-in-law. 

The marginal utility analysis of pricing and the diminishing marginal 
utility proposition can quickly dispose of the diamond-water paradox. The 
relative scarcity of diamonds results in their having a high marginal utility 
and, therefore, a high price, while the relative abundance of water means 
that its marginal utility and, consequently, its price will be low despite its 
high total utility. 

This law of diminishing marginal utility was also used as an explanation 
of the negative slope which is alleged to characterize most simple demand 
curves. The argument is that if the marginal utility of a commodity falls 
when the consumer purchases more of the item, he can only be induced 
to buy more of a good by a fall in its price. 

Another important function of the law of diminishing marginal utility 
arises out of the need for second-order equilibrium conditions. It will be 
recalled (Section 5 of Chapter 4) that a marginal equation such as “price 
equals marginal utility” is not enough to guarantee that the consumer is 
getting the maximum possible utility for his money. There may be several 
purchase levels at which the equation holds. For example, referring back 
to Figure 2, we see that if the marginal utility curve has the peculiar shape 
of curve MM’ and price is OP, then there are three purchase levels Oa, 
Ob, and Oc at which marginal utility equals price. However, these are not 
all optimal purchase levels. In fact, two of these, Oa and Oc, are extremely 
disadvantageous to the consumer! If, for example, the consumer increases 
his purchase quantity from Oa (direction of an arrow), he enters a region 
where marginal utility exceeds price, and it will pay him to increase the 
amount he buys even more. Only when he gets to the true equilibrium 
point B (where marginal utility is diminishing—the curve of MM’ has a 
negative slope) does it pay him to stop increasing his purchases. Similarly, 


5 More formally, if z is the amount of X purchased, and if u(z) is the total utility 
of the purchase (measured in dollars), the consumer presumably seeks to maximize the 
difference between this total utility and his expenditure, pz, i.e., he seeks to maximize 
u(z) - pz. Differentiating with respect to z and setting the result equal to zero we 
obtain du/dz - p = 0, i.e., p = du/dz, the marginal utility of z units of good X. 

from quantity Oc it pays the consumer either to increase or decrease his 
purchases—not to stay at Oc (direction of the arrows). In sum, even if 
price equals marginal utility but marginal utility is increasing (points A 
and C), the consumer is at a point of minimum, not maximum, net gain. 
The “price equals marginal utility” condition only assures us that the 
consumer is on neither the uphill nor the downhill side of a total utility 
hill, but this means that he may be either at the top of the hill or the 
bottom of the valley (see Figure 4 of Chapter 4). Only if marginal utility 
is diminishing (as at point B) do we know that he must be at a point of 
maximum net gain.6 From B it pays him to move neither to the right nor 
to the left (see arrows at point B). Finally, if the law of diminishing marginal 
utility is valid, the entire marginal utility curve will have a negative slope 
(curve DD’ in Figure 2). There will then be only one point, E, where 
marginal utility is equal to price, and it will always pay the consumer to 
move toward the corresponding purchase level, Oe (arrows). The law of 
diminishing marginal utility thus guarantees that there will be only one 
possible equilibrium level, Oe, and that it will possess an element of 
stability—consumers will always be motivated to move toward that point. 
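
The second-order test described above can be illustrated with a small numerical sketch. The marginal utility curve used here is hypothetical: it is built to cross an assumed price of 10 at the purchase levels 1, 2, and 3, which play the roles of Oa, Ob, and Oc in Figure 2. The code classifies each crossing by the sign of the slope of the marginal utility curve.

```python
PRICE = 10.0  # assumed price, playing the role of OP in Figure 2


def marginal_utility(z):
    """Hypothetical wavy marginal utility curve that equals PRICE at z = 1, 2, 3."""
    return PRICE + (z - 1.0) * (z - 2.0) * (z - 3.0)


def marginal_utility_slope(z, h=1e-6):
    """Central-difference estimate of the slope of the marginal utility curve."""
    return (marginal_utility(z + h) - marginal_utility(z - h)) / (2.0 * h)


def classify(z):
    """Apply the second-order test at a point where marginal utility equals price."""
    slope = marginal_utility_slope(z)
    if slope < 0:
        return "maximum of net gain"
    if slope > 0:
        return "minimum of net gain"
    return "inconclusive"


if __name__ == "__main__":
    for z in (1.0, 2.0, 3.0):
        print(f"z = {z}: marginal utility = {marginal_utility(z)}, {classify(z)}")
```

Only the middle crossing, where marginal utility is falling, is reported as a maximum; the two outer crossings are minima of net gain.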

At the beginning of this section we employed a monetary measure of 
marginal utility to make our comparison between the price of a commodity 
and its marginal utility. The marginal utility of X in money terms was 
defined as the maximum amount of money which a consumer is willing to 
pay for an additional unit of X. But the marginal utility theorists were 
generally dissatisfied with such a measure, for when money becomes 
scarcer, they maintained, its subjective marginal value will increase, like 
that of any commodity. Hence, an attempt to measure the marginal utility 
of X by asking the person how much money an additional unit is worth to 
him is like calculating length with a rubber ruler which stretches as we 
measure. Marginal utility must, according to this view, be measured in its 
own, subjective, units—we may call them utils. Some noted economists 
believed that subjective introspective experiments can be conducted suc- 
cessfully and that marginal utility, measured in utils rather than some 
directly observable unit (like money), can be known to diminish. That is, by 
thinking about our own feelings about additions to our holdings of, say, 
packages of spaghetti, we can come to be sure that additional packages are 


6 This is of course the second-order condition—the requirement that if we are maximizing, 
the second derivative of the maximand must be negative. In the current case 
(see the preceding footnote) the maximand is u(z) - pz, whose first derivative with 
respect to z is mu_z - p, where we write mu_z for the marginal utility of z. The second 
derivative, then, is d(mu_z)/dz, which is required to be negative by the second-order 
conditions. That is, marginal utility must be declining, as the text asserts. 

worth less and less to us in these absolute units (which corresponds to no 
objective experience that any of us has ever had). This view can be referred 
to as the neoclassical cardinal utility position.7 


6. Indifference Maps: Ordinal and Cardinal Utility 


Many theorists, who classify themselves as ordinalists, believe that 
measurement of subjective utility on an absolute scale is neither possible 
nor necessary. They question the validity of the introspective data of 
neoclassical cardinal utility and maintain that all consumer behavior can 
be described in terms of preferences, or rankings, in which the consumer 
need only state which of two collections of goods he prefers, without 
reporting on the magnitude of any numerical index of the strength of this 
preference. 

The geometric device employed to represent this sort of ordinal 
preference information is the indifference map (Figure 4a). In this diagram 


[Figure 4. (a) Indifference map: servings of zabaglione on the horizontal axis, number of cummerbunds on the vertical axis.] 


quantities of different commodities are measured along the axes, so that, 
for example, point A on indifference curve II’ represents a collection of 
commodities consisting of one serving of zabaglione and four cummerbunds. 
It represents no more than this, and this datum, by itself, contains no 
information about the consumer in question. In particular, it does not mean 


7 That view is briefly discussed again in Chapter 17, where it is contrasted with 
von Neumann-Morgenstern cardinal utility, an entirely different sort of construct 
despite the similarity in nomenclature. Neoclassical cardinalism is also mentioned in the 
next section, where it is contrasted with the ordinalist position. 

that he is indifferent between the four cummerbunds and the serving of Italian 
dessert. We note also that every possible combination of these two items 
can be represented by a point in this diagram. 

We may now define an indifference curve as the locus of points each of 
which represents a collection of commodities such that the consumer is 
indifferent among any of these combinations. For example, the presence 
of point B on curve II’ means that the consumer is indifferent between 
collections A and B, that is, between the combination of four cummerbunds 
and one serving of zabaglione (point A) and the combination which con- 
sists of two units of each of these items (point B). The indifference map 
consists of the infinite set of indifference curves such as II’ and JJ’ (there 
is, by assumption, one through every point in the diagram) of which only a 
few can be shown in any actual drawing. 

If, for reasons which will be discussed presently, we go along with the 
assumption that the consumer prefers combinations represented by points 
on higher indifference curves (e.g., he prefers collection C to A), the 
indifference map provides us with a complete and simple report on the 
consumer’s ordering of all possible combinations of the two items, for if 
two combinations are represented by points on the same indifference 
curve, the consumer is indifferent between them, and in any other case 
he prefers that collection which is represented by a point on a higher 
indifference curve. 

Let us digress briefly to see how the indifference map is related to the 
neoclassical cardinal utility representation of the consumer's tastes, leaving 
until Sections 14-19 a discussion of utility functions in an ordinal analysis. 
The three-dimensional Figure 4b shows the same consumer's utility surface, 
which is constructed as follows: Lay Figure 4a on a horizontal surface to 
constitute the floor of the diagram. Any point, such as B, on this floor 
again represents a collection of these two items. Now suppose we have 
somehow found out the number of utils which this collection, B, can yield 
to the consumer. We erect over point B a flagpole BB', whose length is 
equal to the number of utils. Similarly, such a flagpole is erected above 
every point on the floor of the diagram representing the utility of every 
possible combination of the two items. For example, DU' is the utility of 
the collection of OD servings of zabaglione (and zero cummerbunds), 
whereas EU is the utility of OE cummerbunds. If we now stretch a canvas 
over the top of the collection of flagpoles, this canvas is the consumer’s 
cardinal utility surface, OUVU' (shaded surface). 

Since all combinations of consumer goods represented by points on an 
indifference curve II’ have equal utility, the flagpoles above such a curve 
must all be of equal height, i.e., the portion of the utility surface which 
lies directly above an indifference curve (such as IBI') must all be of a 
single height (curve iB'i'). In other words, the consumer's indifference 
curves are the contour lines (iso-utility lines) of his utility surface. They 
are the loci of commodity combinations of equal utility, just as the contour 
lines on an ordinary geographic map are loci of combinations of latitude 
and longitude of equal height above sea level.8 

However, to an ordinalist there is one important respect in which this 
geographic analogy does not hold. A contour line on an ordinary map is 
labeled by a number which indicates the height of its points above sea level. 
But an indifference curve bears no number to indicate the corresponding 
height of the utility surface—no cardinal utility number is attached to the 
curve. Hence indifference curves do not contain cardinal utility informa- 
tion—they only record preferences—the order in which the consumer ranks 
the various commodity combinations. From utility information we can 
deduce preferences; the consumer prefers the item whose utility is highest— 
but the converse is not true: The statement that the consumer prefers A to 
B gives us no numerical utility magnitudes. 
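
The ordinalist's point, that the ranking survives any renumbering of the contours, can be made concrete. In the sketch below the utility index (the product of the two quantities) and the location of collection C are assumptions made for illustration; collections A and B are the ones described in the text. Two quite different numberings of the same utility surface give the same ordering of the collections.

```python
import math

# Collections written as (servings of zabaglione, cummerbunds).
BUNDLES = {
    "A": (1.0, 4.0),  # one serving and four cummerbunds, as in the text
    "B": (2.0, 2.0),  # two units of each item, as in the text
    "C": (3.0, 3.0),  # assumed position of a point on a higher curve
}


def utility(bundle):
    """Hypothetical cardinal utility index: the product of the two quantities."""
    z, c = bundle
    return z * c


def relabeled_utility(bundle):
    """The same contours with different numbers attached (an increasing transformation)."""
    return math.log(utility(bundle)) + 7.0


def ranking(index):
    """Names of the collections ordered from least to most preferred under an index."""
    return sorted(BUNDLES, key=lambda name: index(BUNDLES[name]))


if __name__ == "__main__":
    print(ranking(utility))
    print(ranking(relabeled_utility))
```

Both indices place A and B on the same contour and C above them, although the utility numbers they attach to each collection differ.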


7. Properties of Indifference Curves 


The slope of an indifference curve has a significant economic inter- 
pretation. For example, in Figure 4a we see that the arc AB has the slope 
AD/DB. But in moving from point A to B the consumer loses AD (2) 
cummerbunds and gains DB (1) serving of zabaglione. Since A and B are 
indifferent, it must mean that the DB unit gain in his zabaglione holdings 
just compensates him for his AD unit cummerbund loss. Thus the absolute 
(i.e., positive) value of the slope, AD/DB = 2/1 = 2, indicates that it takes 
one serving of zabaglione to supply heart balm to the consumer for the 
loss of two cummerbunds. This absolute value of the slope, called the 
consumer’s marginal rate of substitution of zabaglione for cummerbunds, 
therefore represents the number of units of the latter whose loss can be 
made up by a unit gain in the former. It is the consumer’s psychological 
rate of exchange between the two commodities. 

We can also show that this slope (which in the rest of this chapter is 
taken to mean its absolute value) is equal to the fraction (marginal utility 
of zabaglione/marginal utility of cummerbunds),9 that is, the marginal 


8 If we describe the utility surface by means of a function u = f(x_1, ..., x_n), where 
x_i is the quantity of good i consumed, then the equation of an indifference curve is 
f(x_1, ..., x_n) = k (constant) with different indifference curves corresponding to different 
values of k. 

9 Proof: If arc AB is sufficiently small, the utility loss involved in giving up AD 
units of cummerbunds is the marginal utility of such a unit (mu_c) multiplied by AD, 
the number of units involved, i.e., the loss in giving up AD = (AD) × (mu_c). Similarly, 
the utility gain involved in acquiring DB units of Z is (DB) × (mu_z), where mu_z 
represents the marginal utility of zabaglione. Since the gain and the loss just offset one 


rate of substitution of Z for C equals 

$$\left|\frac{\Delta c}{\Delta z}\right| = \frac{\text{marginal utility of } Z}{\text{marginal utility of } C},$$


where z and c represent, respectively, the quantities of zabaglione and 
cummerbunds. 
Two features of this result bear some discussion: 


1. In the equation 


$$\left|\frac{\Delta c}{\Delta z}\right| = \frac{\text{marginal utility of } Z}{\text{marginal utility of } C}$$


it is noteworthy that c appears in the numerator of the left-hand fraction 
but in the denominator of the fraction on the right-hand side of the equation 
and that the reverse holds for z. This inverse relationship between Δc and 
the marginal utility of C is easily explained. Δc = AD units of C is the 
amount of C which the consumer is willing to give up for Δz = DB units 
of Z. But the more valuable C is to him (the greater the marginal utility of 
C), obviously the less the consumer will be willing to give up in exchange 
for Δz; i.e., the smaller will be Δc; hence the inverse relationship. 

2. A second thing to be noted is that marginal utility seems to have 
sneaked back into the analysis despite the ordinal nature of the indifference 
map. However, its return is not as serious as it may appear from the 
point of view of the ordinalist. Only the ratio of two marginal utilities ever 


another (points A and B are indifferent), we have AD × mu_c = DB × mu_z. Dividing 
both sides of the equation by mu_c × DB we obtain the required result: 
mu_z/mu_c = AD/DB = the slope of II’. 
Alternatively, one can obtain the equation of the text from the expression for the utility 
function, u = f(c, z), and the formula for total differentiation (Chapter 4, Section 7). 
Since along an indifference curve total utility must be constant, we must have du = 0 or 

$$du = \frac{\partial u}{\partial c}\,dc + \frac{\partial u}{\partial z}\,dz = 0.$$

Bringing the first term over to the right and dividing through by dz ∂u/∂c we obtain at 
once 

$$-\frac{dc}{dz} = \frac{\partial u/\partial z}{\partial u/\partial c},$$

which is the relationship in the text. 

occurs in indifference analysis. In such a ratio we measure the marginal 
utility of one commodity not in terms of utils but in terms of the other 
commodity. We ask how much of C an additional unit of Z is worth (the 
marginal rate of substitution of Z for C). Thus we are, in effect, back to 
measuring marginal utility in terms of money, or some other commodity, 
and that is perfectly satisfactory to the ordinalist. 
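
A numerical sketch may help to show why only the ratio matters. The utility index below (the product of the two quantities) is a hypothetical choice made so that collections A and B of Figure 4a are indifferent; the cubed index is an arbitrary renumbering of the same preferences. Marginal utilities are estimated by finite differences.

```python
def utility(z, c):
    """Hypothetical utility index under which (1, 4) and (2, 2) are indifferent."""
    return z * c


def cubed_utility(z, c):
    """The same preferences renumbered by an increasing transformation."""
    return utility(z, c) ** 3


def marginal_utilities(index, z, c, h=1e-6):
    """Central-difference marginal utilities of Z and of C under a given index."""
    mu_z = (index(z + h, c) - index(z - h, c)) / (2.0 * h)
    mu_c = (index(z, c + h) - index(z, c - h)) / (2.0 * h)
    return mu_z, mu_c


def marginal_rate_of_substitution(index, z, c):
    """Units of C the consumer would give up for one more unit of Z."""
    mu_z, mu_c = marginal_utilities(index, z, c)
    return mu_z / mu_c


if __name__ == "__main__":
    for name, (z, c) in {"A": (1.0, 4.0), "B": (2.0, 2.0)}.items():
        for index in (utility, cubed_utility):
            mu_z, mu_c = marginal_utilities(index, z, c)
            mrs = marginal_rate_of_substitution(index, z, c)
            print(f"{name} {index.__name__}: mu_z={mu_z:.2f} mu_c={mu_c:.2f} MRS={mrs:.2f}")
```

The individual marginal utilities change when the index is renumbered, but their ratio does not. Under this assumed index the ratio is larger at A than at B, so the curve flattens toward the right.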


In indifference curve analysis it is customary (at least implicitly) to 
make these assumptions about the psychology of the consumer: 


Assumption 1 (nonsatiety): The consumer is not oversupplied with 
either commodity, i.e., he prefers to have more of C and/or Z. 


Assumption 2 (transitivity): If A, B, and D are any three commodity 
combinations and if A is indifferent with B and B is indifferent with D, 
then the consumer is also indifferent between A and D. This condition 
simply requires that the consumer’s tastes possess a conceptually simple 
type of consistency. 


Assumption 3 (diminishing marginal rate of substitution): Consider two 
collections represented by points along the same indifference curve (e.g., 
A and E in Figure 4a). Then if at one of these points, E, the consumer has a 
relatively small supply of one commodity, C, and a relatively large supply 
of the other, then at E the marginal utility of the relatively scarcer C will 
be large in comparison to that of Z, i.e., the consumer will there be willing 
to give up only a relatively small amount of C in exchange for an additional 
unit of Z. Thus, in Figure 4a, at point A the consumer is willing to give up 
AD units of C for an additional unit of Z. But at point E, where C is scarcer, 
he is only willing to pay the smaller number EF units of C for the same 
increment in his holdings of Z. 


These assumptions permit us to deduce four properties of indifference 
curves which normally characterize their drawings: 


Property A (by Assumption 1). An indifference curve which lies 
above and to the right of another represents preferred combinations of 
commodities. 


Proof: Consider the indifference curves II’ and JJ’ in Figure 4a, and 
combination B on II’ and Q on JJ’. Since point Q is above and to the 
right of point B, it involves more of both commodities C and Z. Hence, by 
Assumption 1, the consumer must prefer Q to B, and therefore he must 
prefer every point on JJ’ (all of which are indifferent with Q) to any 
point on II’. 


Property B. Indifference curves have a negative slope (by Assumption 1). 


Proof: Start, e.g., at point A in Figure 4a and move from it to the 
right so that the consumer holds more of commodity Z as a result. By 
Assumption 1 the consumer must prefer this new point (he cannot be 
indifferent between it and A) unless at the same time it involves his having 
less of the other commodity, C. In other words, if he is to be indifferent 
between the new point and A, it must lie below A as well as to its right, 
as does point B. 


Property C. Indifference curves can never meet or intersect, so that 
only one indifference curve will pass through any one point in the map 
(by Assumptions 1 and 2). 


Proof: Suppose on the contrary that two indifference curves, JJ’ and 
the dashed curve, were to intersect at point Q. Pick point K on the dashed 
indifference curve and point H on JJ’ where H lies above and to the right 
of K. By Property A (Assumption 1) H must be preferred to K. But H is 
indifferent with Q, and Q is, in turn, indifferent with K. Hence, by Assump- 
tion 2, H must be indifferent with K. Since H cannot be both indifferent 
with and preferred to K, the intersection of the two curves which led to 
this self-contradictory result cannot possibly occur. 


Property D. The absolute slope of an indifference curve diminishes 
toward the right (the curve is flatter at point E than it is at point A) so 
that the curve is said to be convex to the origin (by contrast with SS’ in 
Figure 7, which is said to be concave to the origin). This theorem is a direct 
consequence of Assumption 3, which states that the marginal rate of 
substitution of Z for C [which, it will be remembered, is represented by the 
slope of the curve (neglecting minus signs)] is smaller at E than at A 
(Figure 4a). 


8. Violation of the Premises about Indifference Curves. Satiation and Lexicographical Orderings 


While the shapes that have just been described are frequently assumed 
to hold and are extremely convenient analytically for reasons that will be 
indicated presently, they are necessarily valid only on the psychological 
assumptions listed at the beginning of the section, i.e., nonsatiety, transi- 
tivity, and diminishing marginal rate of substitution. Any or all of these 
conditions can be violated in reality and there is nothing necessarily 
pathological about such violations, as will now be shown. 


Assume that our consumer ultimately does become sated with zabaglione—after 
the refrigerator and the freezer are filled with it the householder 
regards further quantities of the dessert with apprehension and perhaps 
with hostility. Suppose Z* is the maximal desired quantity of Z (and 
similarly, let C* be the satiation quantity of cummerbunds). What happens 
to the shape of the indifference curves beyond these quantities? 

In Figure 5a we see that rectangle OZ*SC* is the region of nonsatiation: 

[Figure 5. (a) Regions of satiation and nonsatiation in Z and C; (b) a set of closed indifference curves.] 


Any point in that region (which has been labeled region I) represents a 
combination of the two goods which leaves the consumer wanting more of 
either or both. In that region we see a normally shaped indifference curve— 
the solid locus AB. However, at any point in region II to the right of Z* 
but below C* (e.g., point L), the consumer still wants more of C but now 
he desires less of Z. Hence if he gets still more Z (the move from L to M) 
and yet remains indifferent, he must be compensated for the (repugnant) 
rise in quantity of Z by a desired rise in C. Thus the indifference curve in 
this region must have a positive slope. The reader should verify that the 
same argument holds for region IV in which there is too much C but more Z 
is still desired. However, in region III, where the consumer has more than 
he wants of either item, the indifference curve will again acquire its nega- 
tive slope since there, to compensate him for an addition in his unwanted 
holding of Z, he must be relieved of some of his unwanted C. That is, to 
leave him indifferent a rise in his Z holdings must be accompanied by a fall 
in his C, and vice versa. 

Figure 5b suggests more clearly what is going on by showing a set of 
several indifference curves. It reveals them to be closed contours, one 
inside the other, converging to the saturation point, S (sometimes called 
the “bliss point”), at which the consumer possesses exactly the maximal 
amounts he wants of each of the two commodities. The indifference curves 
can be taken as the contour lines of a smooth utility hill with a single 
maximum point, S, with the surrounding indifference curves representing 
decreasingly desirable possibilities as they move further from the bliss 
point. 

Thus we see 


1. The conventional indifference curves really are only segments of 
the complete indifference curves—those portions lying in region I, the 
region of nonsatiation (scarcity). This is, of course, the relevant region for 
most economic analysis since budgetary limitations do keep consumers 
from complete satiation in every commodity (even the wealthiest of absolute 
rulers has not been able to afford all the military equipment he wanted for 
his armies). 

2. In other regions the negative slope of the indifference curve need not 
hold. Moreover, the curve need not be convex to the origin (region III).10 

3. At points B and T in Figure 5a (zabaglione satiation) the indifference 
curve is horizontal (a small change in Z neither adds to nor subtracts from 
his welfare, and so no change in quantity of C is needed to compensate 
him for such a change). Similarly, at points R and A (cummerbund 
satiation) the indifference curves are vertical. 
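
The pattern of slopes in the four regions can be reproduced with a simple utility hill. The sketch below rests on assumptions that are not in the text: a hypothetical hill u = -(z - Z*)^2 - (c - C*)^2 and satiation quantities of 5 for each good. The slope of the indifference curve at any point is computed as minus the ratio of the marginal utility of Z to that of C.

```python
Z_STAR, C_STAR = 5.0, 5.0  # assumed satiation quantities Z* and C*


def marginal_utilities(z, c):
    """Marginal utilities on a hypothetical hill peaking at (Z_STAR, C_STAR).

    The hill is u = -(z - Z_STAR)**2 - (c - C_STAR)**2, so each marginal
    utility is positive below the satiation quantity and negative above it.
    """
    return 2.0 * (Z_STAR - z), 2.0 * (C_STAR - c)


def indifference_slope(z, c):
    """Slope dc/dz of the indifference curve through (z, c)."""
    mu_z, mu_c = marginal_utilities(z, c)
    if mu_c == 0:
        return float("inf")  # treated as vertical: cummerbund satiation
    return -mu_z / mu_c


if __name__ == "__main__":
    sample_points = {
        "region I": (2.0, 2.0),
        "region II": (8.0, 2.0),
        "region III": (8.0, 8.0),
        "region IV": (2.0, 8.0),
        "zabaglione satiation": (5.0, 2.0),
        "cummerbund satiation": (2.0, 5.0),
    }
    for label, (z, c) in sample_points.items():
        print(f"{label}: slope = {indifference_slope(z, c)}")
```

The sample points are arbitrary. The slope is negative in regions I and III, positive in regions II and IV, zero where Z is at its satiation level, and reported as infinite where C is at its satiation level.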


It is also possible to think of plausible cases in which the nonintersection 
property of indifference curves will be violated (intransitivities). This will 
occur, for example, where the consumer cannot distinguish small differences 


that is, they encompass 
a narrow area rather than a locus (curve) of zero thickness. 

Finally, we note that underlying the entire discussion is a premise 
rarely questioned in elementary texts—the assumption that such curves 
exist. But even that is not necessarily true. It is easy to describe an interesting 
preference relationship for which no such curves exist. The standard 


10 Intuitively, as the quantities of Z rise and C fall as we move from point T toward 
R in Figure 5a, further additions to the holdings of Z become increasingly unbearable, 
while further decreases in the excess holdings of C become less urgent. Hence, to get the 
consumer to accept further increases in his Z he must be compensated by ever-larger 
declines in his C (increasing marginal rate of substitution). 

Actually, this sort of shape of indifference curve can occur also in region I, where it 
characterizes the behavior of an addict or a collector (the more of either commodity he 
has, the more urgently he wants even more of it). If addiction to zabaglione were to 
characterize its consumption, as we move toward point Z* additional units of this item 
will become very valuable and additional C comparatively worthless, i.e., the consumer 
will be willing to trade a small addition in Z for a great loss in C. 


counterexample, which is of interest in itself, is called a lexicographical 
ordering, i.e., it uses a ranking criterion analogous to that used in ordering 
words in a dictionary. Suppose a government of a very poor country with a 
mild climate considers two objectives: more food (x1) and more clothing 
(x2). Since starvation and malnutrition are serious problems and neither 
cold nor modesty (!) are considered pressing issues, the government prefers 
any increase in food output no matter what happens to clothing production. 
Then no increase in clothing output can make up for a unit decline in food 
output. However, the government does favor more clothing output for its 
decorative and amusement value, provided no food need be given up to 
get it. Thus, if we start out with 8 units of food and 8 of clothing (point A 
in Figure 6), the government will prefer any point involving more food than 


[Figure 6. Lexicographical ordering of food and clothing (clothing on the horizontal axis); the inferior region includes its dotted boundary.] 


A regardless of the associated quantity of clothing (e.g., points B, C, or D). 
It will also prefer any point on AE to the right of A (more clothing with no 
less food). However, it will consider inferior to A any point below A or on 
the line segment FA to the left of A. Thus, every point in the diagram 
other than A is either preferable to A or less desirable than A. There can 
be no second point that is indifferent to A, so that no indifference curve 
through A (or through any other point in the diagram) is possible, just as 
our discussion was intended to show. 
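
The government's ranking can be written down in a few lines, since Python compares tuples element by element from the left, which is itself a lexicographical rule. The sketch below is illustrative only: the function names and the small grid of whole-number bundles are assumptions, while point A = (8 units of food, 8 of clothing) is taken from the text.

```python
A = (8, 8)  # (food, clothing), the starting point used in the text


def prefers(first, second):
    """True if the government strictly prefers `first` to `second`.

    Bundles are (food, clothing) pairs. Tuple comparison looks at food
    first and consults clothing only when the food quantities are equal.
    """
    return first > second


def indifferent(first, second):
    """Two bundles are indifferent only if neither is preferred to the other."""
    return not prefers(first, second) and not prefers(second, first)


def indifferent_points(reference, grid):
    """All bundles in the grid between which and `reference` there is indifference."""
    return [bundle for bundle in grid if indifferent(bundle, reference)]


if __name__ == "__main__":
    grid = [(food, clothing) for food in range(0, 17) for clothing in range(0, 17)]
    print(prefers((9, 0), A))   # more food, far less clothing
    print(prefers((8, 9), A))   # same food, more clothing
    print(prefers(A, (7, 16)))  # less food, much more clothing
    print(indifferent_points(A, grid))
```

On this grid the only bundle indifferent with A is A itself, which is why no indifference curve can be drawn through it.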


PROBLEMS 


1. Show the pertinent indifference curve and equilibrium point for a wealthy 
ruler who has all the zabaglione he can possibly want but wishes he could 
afford more military equipment. 

2. Explain why point F in Figure 6 is considered inferior to A but B, which is 
very close to F, is superior to A. 

3. Explain the analogy between the lexicographical ordering as described in the 
text and the ordering of words in a dictionary. 


9. Price Lines: Consumer Income and Prices 


By itself, an indifference map cannot possibly predict consumer 
behavior because it leaves out two vital types of information—the income 
of the consumer and the prices of the commodities. The indifference map 
does not ask the consumer which combination he believes will give him the 
most for his money. It is merely a hypothetical ranking of various com- 
modity combinations—perhaps castles in Spain against yachts in Portugal 
—taking no account of what the consumer can afford. 

Price and income information is supplied in an indifference diagram by 
another curve which is called the price line or, sometimes, the budget line. 
Since the axes of the diagram present only quantities of commodities 
rather than amounts of money, dollar prices and incomes cannot be shown 
directly. Instead, the price line does the next best thing and indicates 
what amounts of the commodities a given amount of money can buy. 

For example (Figure 7), suppose $50 spent exclusively on commodity Z 
will, at its current price, buy OP' units of that commodity, whereas the same 
amount spent entirely on C will purchase exactly OP units of that item. Suppose, 
moreover, that every point such as A on line PP’ represents a combination of 
the two commodities which sells for $50 (e.g., $10 worth of Z plus $40 of C). Then 
line PP’ is a price or budget line. Such a line is defined as the locus of all 
combinations of commodities which cost some fixed amount of money (e.g., our 
illustrative $50). 

[Figure 7] 

If the prices of both commodities are fixed, that is, they do not vary 
with the amounts of the goods which are purchased, the price line will 
possess the following properties: 


1. It will be a straight line. 

2. It will have a negative slope. 

3. Its slope will be equal to the negative inverse of the ratio of the prices 
of the two commodities, i.e., we will have Δc/Δz = -p_z/p_c, where p_z 
and p_c are the unit prices of Z and C, respectively. 

4. Suppose two price lines involve the same commodity prices but 
represent the expenditure of different amounts of money (say $50 for PP' 
and $30 for RR’). Then the two lines will be parallel. 


The equation of the price line is, in the fixed price case, given by a 
simple expression. If the consumer buys z units of commodity Z, his total 
expenditure on this item will be p_z z (the price per unit, p_z, multiplied 
by the number of units purchased). Similarly, expenditure on the other 
commodity is given by p_c c so that total expenditure is given by 


$$p_z z + p_c c = m, \tag{4}$$


where m is the amount of money spent (our illustrative $50 for line PP’) 
and is therefore constant along a price line. This, then, is the equation of a 
price line.11 
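
The properties of the fixed-price line can be checked with a short computation. The prices used in the example run ($5 per unit of Z and $10 per unit of C) are assumptions chosen only so that the text's illustration of $10 worth of Z plus $40 of C falls on the $50 line; the outlays of $50 and $30 are the ones the text attaches to lines PP' and RR'.

```python
def price_line(p_z, p_c, m):
    """Describe the price line p_z*z + p_c*c = m for fixed unit prices."""
    return {
        "z_intercept": m / p_z,  # units of Z bought if all of m is spent on Z
        "c_intercept": m / p_c,  # units of C bought if all of m is spent on C
        "slope": -p_z / p_c,     # change in c per unit change in z
    }


def c_on_line(p_z, p_c, m, z):
    """Quantity of C that exhausts m once z units of Z have been bought."""
    return (m - p_z * z) / p_c


if __name__ == "__main__":
    p_z, p_c = 5.0, 10.0  # hypothetical unit prices of Z and C
    for outlay in (50.0, 30.0):
        print(outlay, price_line(p_z, p_c, outlay))
    # $10 worth of Z is 2 units; the remaining $40 buys this many units of C.
    print(c_on_line(p_z, p_c, 50.0, 2.0))
```

The two lines have the same slope, so they are parallel, and the slope depends only on the ratio of the prices, not on the amount of money spent.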

The four properties can readily be generalized to take account of price 
variability. There are two possibilities: Either that buying in quantity will 
make the commodities scarce relative to the quantities demanded and so 
raise their prices to the purchaser (as wages go up when the demand for 
labor increases), or, on the other hand, that he will be offered discounts if 
he buys in larger quantities (special today: one elephant, $200, or two 
for $325). The former possibility, which can be interpreted as a case of 
diminishing returns to an increased number of dollars spent by a large 
purchaser on a given commodity, will yield a curved price line which, like 
SS’, is concave to the origin (Figure 7). The reason is that as one moves 
toward the axes from an interior point such as D, a greater proportion of 
this large consumer’s fixed amount of money, m, is spent on one of the 
commodities; thus near S almost all of it is spent on commodity C. This 
raises the price of C against the consumer so that his m dollars will buy 
only OS—which is less than the OR units he could obtain for m dollars if 
the price of C were fixed at the level it is at point D. 

For a completely analogous reason, quantity discounts (increasing 
returns to increased expenditure on any one item) will result in a budget 
line which, like II’, is convex to the origin. 

There remains one point to discuss about price lines. What do they 
tell us about the consumer’s income and the prices of the various products? 
First, to deal with the information on consumer income which is conveyed 
by a price line, it is convenient to define the multicommodity analogue 
of a price line [Equation (4)]—the algebraic budget relationship for all of 


11 The four properties of the price line are readily derived from this equation. 
Dividing both sides by p_c and rearranging terms the equation becomes 

$$c = -\frac{p_z}{p_c}\,z + \frac{m}{p_c}.$$

If we now change our notation, writing y for c, x for z, a for -p_z/p_c, and b for m/p_c, 
this becomes the standard linear equation of Chapter 2, y = ax + b, with (negative) 
slope a = -p_z/p_c. The four price-line properties follow directly from this result, as the 
reader should verify. 

Note also that m/p_c = the total amount of C the consumer could purchase if he 
were to spend all of his income on that item. 

the (say, 1,257) different commodities which the consumer buys or 
considers buying: 


$$p_1 x_1 + p_2 x_2 + \cdots + p_{1257} x_{1257} = M,$$


where, e.g., x_2 is the quantity of commodity number 2 purchased and p_2 
is its unit price. In such a multicommodity budget equation it is convenient 
to consider savings to be one of the 1,257 goods which he buys or 
can buy for his money. On this interpretation, the consumer has no choice 
but to spend all his money (either on savings or on some other commodity) 
and the only relevant price line is the one which uses up all of the funds 
which he has available to him. This price line, then, specifies the consumer’s 
real income (or wealth). It tells us just what combinations of commodities 
he can afford to buy, given prices and his money income. 

So much for the income information supplied by a price line. Let us 
now see what the price line tells us about prices. 

Property 3 states that the slope of such a line tells us the ratio of the 
prices of the commodities. If the slope is -2, we know that the price of Z
must be twice the price of a unit of C (note again the inverse relationship,
-p_z/p_c = Δc/Δz).

To summarize, the price line specifies the real purchasing power which
is available to the consumer and the ratio of the prices of the two com-
modities. But since monetary quantities are not shown anywhere on the
diagram, it is impossible for the price line by itself to specify either the
level of the consumer’s liquid assets or the money price of any commodity.12
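
The point can be checked numerically. The following is only a minimal sketch: the function name price_line and the dollar figures are illustrative assumptions, chosen so that the slope is the -2 of the example above. It shows that doubling income and both prices leaves the line unchanged, which is why the diagram alone cannot reveal money values.

```python
def price_line(m, p_z, p_c):
    """Describe the price line p_z*z + p_c*c = m.

    Returns (most C affordable, most Z affordable, slope dc/dz).
    """
    return m / p_c, m / p_z, -p_z / p_c


# Hypothetical figures: $100 to spend, Z at $20 and C at $10 per unit.
print(price_line(100.0, 20.0, 10.0))   # (10.0, 5.0, -2.0)

# Doubling every money value leaves the line exactly where it was.
print(price_line(200.0, 40.0, 20.0))   # (10.0, 5.0, -2.0)
```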


10. Equilibrium of the Consumer 


The consumer who wants to get the most for his money will want to 
land on as high an indifference curve as his purchasing power permits— 
the highest indifference curve which can be reached from his budget line. 
This optimal purchase combination is given by the point of tangency, T,
between the price line and indifference curve II’ (Figure 7), for it is clear,
by inspection of the diagram, that any other point on the price line, such
as B, will be intersected by an indifference curve which lies below II’. In
this way, the indifference map together with the price line permit us to 
predict the demand pattern of the “rational” consumer—the consumer 
who spends his money efficiently in the pursuit of his needs and interests. 
We say that T is a point of equilibrium because once the consumer arrives at 


12 However, if we know any one of these values, the others follow at once. For 
example, if the price of C is known to be $10 and the price line shows Z to be twice as
expensive as C, then the price of Z must obviously be $20. Similarly, since his money buys
OP units of C at $10 per unit, his expendable money must be OP times $10.

the decision to purchase the combination of commodities represented by
that point, he has no motivation to revise his purchase plans. 

The tangency condition of equilibrium immediately yields another 
equilibrium condition. At their point of tangency the slope of the price 
line and that of the indifference curve must, by definition, be equal. But 
we know that the (absolute value of the) slope of the budget line is equal 
to the (inverse) ratio of the two prices, whereas the slope of the indifference 
curve is equal to the (inverse) ratio of the two marginal utilities or to 
the marginal rate of substitution of Z for C. Therefore, in equilibrium we 
must have 


p_z/p_c = mu_z/mu_c = marginal rate of substitution of Z for C.


This is the marginal condition of equilibrium of the consumer. It resembles
the neoclassical equilibrium condition that price must equal the marginal 
utility of a commodity, but states, instead, that the ratio of the marginal 
utilities of two commodities must equal the ratio of their prices. 

The logic of this condition is easily demonstrated. Suppose the condition 
is violated so that, e.g., the first of these fractions is greater than the 
second. Then, multiplying both sides by the presumably positive number 
mu_c/p_z, we obtain the inequality mu_c/p_c > mu_z/p_z. But if item C costs,
e.g., p_c = $5 per unit, we can for $1 obtain 1/5 = 1/p_c units of this item,
and (1/5)mu_c = (1/p_c)mu_c therefore represents the utility which can be
obtained by spending an additional dollar on C. The last inequality therefore
states that the consumer can acquire more utility out of an additional 
dollar spent on C than from another dollar spent on Z. If this is so, he 
cannot possibly be getting the most for his money—he can get more by 
reallocating his funds, spending less on Z and more on C. This is illustrated 
in Figure 7, where we note that at point B the absolute value of the slope 
of the indifference curve is less than that of the price line (the indifference 
curve is flatter) so that we have mu_z/mu_c < p_z/p_c and so, as before,
mu_z/p_z < mu_c/p_c. It therefore pays the consumer to plan to buy less of Z
and more of C, i.e., for him to move upward and to the left along the 
price line from B toward the point of tangency T. We see then that B 
violates our equilibrium condition and that it does so in a way which motivates 
the consumer to move toward the equilibrium point T. Thus with curves 
of the usual shape (as in the diagram), the equilibrium point possesses an
element of stability. From any other point on the price line the consumer 
is motivated to move in the direction of the point of tangency. 
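
The argument can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration only: the utility function u = z^alpha * c^(1 - alpha), the income and prices, and the function names. It searches the price line for the highest-utility point and compares the utility obtained from an extra dollar spent on each good, first at that point and then at a point that, like B, contains too much Z.

```python
def marginal_utilities(z, c, alpha=0.5):
    """Marginal utilities of the assumed utility u = z**alpha * c**(1 - alpha)."""
    mu_z = alpha * z ** (alpha - 1) * c ** (1 - alpha)
    mu_c = (1 - alpha) * z ** alpha * c ** (-alpha)
    return mu_z, mu_c


def utility_per_dollar(z, c, p_z, p_c, alpha=0.5):
    """Return (mu_z/p_z, mu_c/p_c): utility from one more dollar on each good."""
    mu_z, mu_c = marginal_utilities(z, c, alpha)
    return mu_z / p_z, mu_c / p_c


def best_point(m, p_z, p_c, alpha=0.5, steps=10000):
    """Search the price line p_z*z + p_c*c = m for the highest-utility point."""
    best_u, best_z, best_c = -1.0, 0.0, 0.0
    for i in range(1, steps):
        z = (m / p_z) * i / steps
        c = (m - p_z * z) / p_c
        u = z ** alpha * c ** (1 - alpha)
        if u > best_u:
            best_u, best_z, best_c = u, z, c
    return best_z, best_c


# Hypothetical figures: $100 to spend, Z at $20 and C at $10 per unit.
m, p_z, p_c = 100.0, 20.0, 10.0
z_t, c_t = best_point(m, p_z, p_c)
print("tangency point:", z_t, c_t)
print("utility per dollar at T:", utility_per_dollar(z_t, c_t, p_z, p_c))
print("utility per dollar at (4, 2):", utility_per_dollar(4.0, 2.0, p_z, p_c))
```

Under these assumptions the two per-dollar figures coincide at the tangency point, while at the point with more Z the dollar spent on C yields more, so the consumer gains by moving up and to the left along the price line.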

Indeed, the shape we have assumed for the indifference curves plays 
an important role in our tangency solution. If any one of the four properties 
of indifference curves (listed in Section 7, above) were violated, consumer
equilibrium would not occur at a point of tangency. Thus, if Property A
were violated so that the consumer wished, say, to be on the lowest attain- 
able indifference curve, his optimal point would be P rather than T, i.e., 
he would end up spending his money exclusively on one commodity. If 
Property B were violated so that the slope of the indifference curves was 
not negative, there could be no point of tangency with the negatively 
sloping price line. If Property C (nonintersectability of indifference curves) 
were violated, a number of points of tangency might occur (Figure 8a), 
and if the indifference curves were concave to the origin, in violation of 


Figure 8 (panels a and b)


Property D, the point of tangency would yield the lowest attainable 
indifference curve, whereas the highest indifference curve would lie at one 
of the end points of the price line (P’ in Figure 8b), so the rational con- 
sumer would again end up spending all of his money on just one commodity! 
Note that at the point of tangency in Figure 8b the consumer is at the 
point of minimum utility on his price line.13


PROBLEM 


Show that if indifference curves are positively sloped the optimal point is 
likely to occur at a corner (an end point of the price line). Interpret this case 
in terms of Figure 5. 


11. Responses to Price and Income Changes


If the income of the consumer increases, his budget line will retain its 
slope (relative prices remain unchanged) but that line will then shift 


13 What has gone wrong here is that while at the tangency point the first-order 
conditions for a maximum are satisfied (MRS equal to the ratio of prices), the second- 
order conditions are those required for a minimum rather than a maximum (i.e., it is as 
though, in a one-variable function, the second derivative were positive). 




upward (he can get more goods for his increased money supply). In other 
words, income changes cause parallel shifts in the budget line, and a set 
of parallel budget lines (Figure 9a) shows how the consumer’s possible 
purchases will vary with changes in his income. On each such line we can 
find the equilibrium point of tangency (points T1, T2, etc.). The curve OW,
which is the locus of all such points, shows how the consumer’s purchases 
of the two commodities will vary when his income changes. Such a curve 
is called an income-consumption curve or, sometimes, an Engel curve 
(named after an early student of the effects of income changes on consumer 
expenditure patterns). 

Normally, consumers may be expected to increase their purchases of 
commodities as their incomes rise. But sometimes, if an item is of low 
quality, demand for it will drop as the consumer’s financial position 
improves and more desirable commodities are substituted for it. Such an
item is called an inferior good. Plausible examples of inferior goods are 
recapped automobile tires, poorly made clothing, poor cuts of meat, etc., 
any of which the consumer may be buying only because he can afford no 
better. In Figure 9a commodity Z is taken to be an inferior good. This is 
shown by the relative positions of points T3 and T4 (the negatively sloping
segment of OW). The latter point lies to the left of the former (it represents
a lower quantity of Z) despite the fact that it (T4) is on a higher budget
line and therefore involves a higher income for the consumer. 


Figure 9 


Next, we can investigate the effects on the consumer's purchases of 
changes in the price of one of the commodities. Suppose the price of Z 
falls, other things remaining equal. This means that the buyer can get more 
of this commodity for his money (e.g., OP; instead of OP» in Figure 9b), 
though he can only obtain the same amount of C as before (quantity of OP). 
We see, then, that a fall in the price of the item on the horizontal axis leads 
the price line to flatten out by swinging to the right. Figure 9b represents a 
number of such price lines and the corresponding equilibrium tangency 
points. Curve PV, the locus of these points of tangency, shows how changes 
in the price of Z affect the purchases of both commodities. PV is called a
price-consumption curve or sometimes, particularly in international trade
theory, an offer curve. 

It will be noted that the income-consumption curve, OW, begins at the 
origin (point O) because with zero income the consumer can buy none of
either commodity. By contrast, the price-consumption curve, PV,
characteristically begins at point P, the pivot point of the swinging price 
line in Figure 9b. The reason is that, as the price line approaches the vertical 
axis (the price of Z increases further and further), the consumer finds that 
he gets less and less of Z for his money. Eventually, when its price goes 
high enough, the consumer will be forced out of buying Z altogether and 
he will therefore spend all his money on the remaining commodity, C, i.e., 
he will buy OP units of C and no Z (point P). 

The offer curve construction can readily be translated into an ordinary
demand curve for the consumer14 if one of the commodities represented in
the diagram is M, the money held by the consumer (Figure 10a). By this
device money values are inserted into the indifference map. As before, let
PV be the offer curve so that if the consumer buys zero units of the
commodity he will have $30 for himself (point P). Now consider point A on 


Figure 10


the offer curve which represents the consumer possessing z = 1 unit of 
commodity Z and m = $12. Since in moving from P to A he acquired 
1 unit of the good but gave up 18 = 30 — 12 dollars, the price per unit at 
A must be $18. Thus, point A states that the consumer will buy 1 unit of 
the commodity if its price is $18. This information is recorded by point a 
in Figure 10b. Similarly, point B on his offer curve involves the buyer’s 
spending 20 = 30 — 10 dollars on two units of the good so the price per 
unit must be $20/2 = $10. Hence (point b in Figure 10b) the offer curve 


14 It should be noted, however, that even if the consumer has preference patterns
for which an indifference map exists (cf. Section 8) it need not follow that a correspond-
ing demand function exists unless the (ordinal) utility function is differentiable.

tells us that he is prepared to buy two units if the price is $10. Points c, e,
and f in Figure 10b are derived similarly. These are clearly points on the 
consumer’s demand curve since they indicate how many units he is 
prepared to buy at different prices. DD’, the locus of all such points, is the 
demand curve for this consumer. 
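
The arithmetic of this translation is simple enough to sketch in a few lines. The function name demand_points is an illustrative assumption; the figures are those of points P, A, and B in the text.

```python
def demand_points(money_at_start, offer_points):
    """Convert offer-curve points into demand-curve points.

    offer_points holds (units of Z bought, dollars left over) pairs.
    Returns (price per unit, units bought) pairs.
    """
    points = []
    for units, money_left in offer_points:
        spent = money_at_start - money_left
        points.append((spent / units, units))
    return points


# Points A and B from the text: the consumer starts with $30 and keeps
# $12 after buying one unit, $10 after buying two units.
print(demand_points(30, [(1, 12), (2, 10)]))   # [(18.0, 1), (10.0, 2)]
```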

It is noteworthy that the offer curve gives us information about the 
elasticity of the demand curve. For example, inspection of PV tells us that 
to the left of point c the demand curve DD’ must be elastic. To see why 
this is so, note that the unit price at point B ($10) is lower than that at 
A ($18) but that total consumer expenditure on the commodity at B 
(BB' = $20) is greater than that at A (AA’ = $18). Thus a fall in price 
has produced a rise in total expenditure (the price-consumption curve has 
a negative slope). By elasticity Proposition 2 in Section 4 of this chapter,
expenditure will rise when price falls only if the demand curve is elastic. 
The reader should have no difficulty showing that DD’ is unit elastic at 
horizontal point c and inelastic to the right of point c. 
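
The expenditure test used here can be written out as a short sketch. The function name is an illustrative assumption, and the classification rests only on the relationship between price changes and total expenditure cited above.

```python
def elasticity_from_expenditure(price_1, spend_1, price_2, spend_2):
    """Classify demand between two points by how spending moves with price."""
    if price_1 == price_2:
        raise ValueError("the two points must have different prices")
    co_movement = (price_2 - price_1) * (spend_2 - spend_1)
    if co_movement < 0:
        return "elastic"        # spending moves against price
    if co_movement > 0:
        return "inelastic"      # spending moves with price
    return "unit elastic"       # spending is unchanged


# From A (price $18, outlay $18) to B (price $10, outlay $20).
print(elasticity_from_expenditure(18, 18, 10, 20))   # elastic
```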


12. Income and Substitution Effects: The Slutsky Theorem


It is customary to analyze somewhat further the effect on purchases of 
a change in the price of one of the commodities. The effect, for example, 
of a fall in the price of Z is classified into two categories: the income and the
substitution effect. Its lower price makes Z a better buy relative to C than 
it was before, and, as will be shown presently, that consequence by itself 
would always induce the consumer to increase his purchase of Z (the 
Slutsky theorem). This price-ratio portion of the effect of a price change on 
purchases is called the substitution effect. Purchases of Z will be substituted 
for those of C because Z is relatively more price-attractive than it was 
initially. 

But the fall in price of Z also affects the purchases of both commodities 
in another way—it increases the purchasing power of the consumer's 
income. This will, in turn, always increase the purchases of both commodities 
provided that neither of them is an inferior good, the demand for which is 
reduced by an increase in real income. The income effect, then, is the effect 
on the consumer’s purchases of the rise in real income which results from a 
fall in the price of commodity Z. Note that the income effect refers to the 
resulting change in his purchases and not to the change in his real income. 

To summarize, a fall in the price of any commodity, X, will affect the
consumer's demand for X. This effect may be subdivided into two parts: 
the substitution effect, which always increases the demand for X, and the 
income effect, which will increase the demand for X unless X is an inferior 
good. Thus, ignoring this exceptional possibility, the demand curve for X 
must have a negative slope, i.e., a fall in the price of X must increase the
demand for that commodity. Even if X is an inferior good its demand curve
will still have a negative slope unless the income effect is stronger than the 
substitution effect, for, as will soon be shown, the substitution effect of a 
lower price of X is always a rise in the demand for X. In addition, in 
practice, the income effect for most consumers’ goods is likely to be small 
because a buyer's outlay on any one commodity constitutes a relatively 
small proportion of his budget, so a fall in the price of that item alone will
not increase his real income significantly.15 

Two different graphic depictions of the income and substitution effects 
have been employed in the literature. In Figure 11a and 11b let PP’ and 
PP” be two price lines involving different prices of commodity Z, and let 
A and B represent the (tangency) equilibrium points on the two price 
lines. The total effect of the price change on the amount of Z purchased is, 
therefore, ab. Our object is to divide ab into two parts—the income effect 
and the substitution effect. For this purpose the change in position of the 
price line is divided artificially into two parts: a parallel shift (a change in 
real income with no change in relative prices) and a pivot or twisting 
(change in slope) of the price line (a change in relative prices with no 
change in real income). To accomplish this division we employ an imaginary 
price line RR’ in Figure 11a (or SS’ in Figure 11b) which is parallel to one 
of the price lines (they have the same relative prices) and is, in some sense, 
at the same real income level as the other. Here is where the ambiguity in 
interpretation occurs (the source of difference between the two diagrams). 


Figure 11 


15 But, at least as a remote possibility, we see that a very inferior good for which 
the income effect is very high provides another possible case of a positively sloping 
demand curve. The other two cases which were mentioned in Section 1 of this chapter 
(snob appeal and quality judged by price) do not show up in the usual indifference map 
analysis because each of these involves the consumer’s preference structure being 
changed by the price change. He values platinum collar stays or a brand of frozen chop 
suey more highly when its price rises. In other words, the consumer’s indifference curves 
shift when there is a swing in the price line—a possibility which has not been considered 
in the text.

When do two price lines, which are not identical, represent the same real
income? One highly persuasive solution is to say that this occurs when
they both yield the same satisfaction to the consumer, i.e., they are both
tangent to the same indifference curve, as are PP’ and SS’ in Figure 11b
(they are tangent, at points A and D, respectively, to indifference curve
JJ’). There is another solution, which is perhaps less satisfying intuitively
but which is very useful and which we will need presently. This is to say
that RR’ (Figure 11a) yields the same income as PP’ if RR’ passes through 
point A so that it just gives the consumer enough money to buy combina- 
tion A, the combination he would buy if PP' were in fact the prevailing 
price line. In this case, the point of equilibrium, D, on the imaginary price 
line RR’ lies on an indifference curve II’, which is not tangent to PP’.
Indeed, since line RR’ in Figure 11a is higher than SS’ in 11b, the in- 
difference curve II’ to which RR’ is tangent must lie above indifference 
curve JJ’ in 11b, which is tangent to both SS’ and the original price line, 
PP’. 

The income and substitution effects can now be read off from the 
diagrams. The substitution effect is ad, the change in purchase of Z which 
results from the twisting of the imaginary price line, whereas the income 
effect is db, the effect of the parallel shift in the price line. 

In this two-commodity analysis, Figure 11b can be used to show that
when the price of Z falls (the price line flattens out), the substitution effect
must lead to a rise in the demand for Z, for SS’ and PP’ are both tangent 
to the same indifference curve. But since SS’ is the flatter price line, its 
point of tangency, D, must occur to the right of A, the point of tangency 
of PP’ (because the slope of an indifference curve gets smaller toward the 
right). Hence, with the lower relative price of Z (RR’ or SS’) the demand 
for Z (d) will be greater than the demand for Z when the price line is PP’.
Unfortunately this argument is not valid when there are more than two
commodities so that the consumer’s preferences cannot be summed up in a
two-dimensional indifference map. Presently, more general proofs of this 
result, the Slutsky theorem, will be presented (Chapter 13, Section 8). 
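
The two graphic versions of the decomposition can be compared in a small numerical sketch. The utility function u = z^alpha * c^(1 - alpha), the income and price figures, and all function names are assumptions made purely for illustration. One version gives the consumer just enough income to buy combination A at the new prices (as in Figure 11a); the other gives him just enough to reach A's indifference curve (as in Figure 11b).

```python
def demand_z(m, p_z, alpha=0.5):
    """Demand for Z under the assumed utility u = z**alpha * c**(1 - alpha)."""
    return alpha * m / p_z


def demand_c(m, p_c, alpha=0.5):
    """Demand for C under the same assumed utility function."""
    return (1 - alpha) * m / p_c


def decompose(m, p_z_old, p_z_new, p_c, alpha=0.5):
    """Split the change in Z purchased into substitution and income effects."""
    z_a, c_a = demand_z(m, p_z_old, alpha), demand_c(m, p_c, alpha)
    z_b = demand_z(m, p_z_new, alpha)

    # Figure 11a version: the imaginary line passes through A at the new prices.
    m_through_a = p_z_new * z_a + p_c * c_a
    z_d_a = demand_z(m_through_a, p_z_new, alpha)

    # Figure 11b version: the imaginary line is tangent to A's indifference curve.
    u_a = z_a ** alpha * c_a ** (1 - alpha)
    utility_per_dollar = (alpha / p_z_new) ** alpha * ((1 - alpha) / p_c) ** (1 - alpha)
    m_same_utility = u_a / utility_per_dollar
    z_d_b = demand_z(m_same_utility, p_z_new, alpha)

    return {
        "total": z_b - z_a,
        "through_A": (z_d_a - z_a, z_b - z_d_a),        # (substitution, income)
        "same_utility": (z_d_b - z_a, z_b - z_d_b),     # (substitution, income)
        "incomes": (m_through_a, m_same_utility),
    }


# Hypothetical figures: income $100, price of Z falls from $20 to $10, C costs $10.
for name, value in decompose(100.0, 20.0, 10.0, 10.0).items():
    print(name, value)
```

In this example both versions give a positive substitution effect, and the line through A corresponds to the higher of the two imaginary incomes, in keeping with the remark that RR’ lies above SS’.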


13. The Role of the Income Effect 


The reader may well wonder what the fuss is all about—why the mere 
classification of the effects of a price change into two portions—the 
substitution and the income effect—should have elicited so much attention 
in the literature. The essence of the matter is that Eugen Slutsky and, 
after him, J. R. Hicks and R. D. G. Allen discovered independently that 
such a price effect has two portions one of which, the substitution effect, is 
predictable in sign and in many of its other characteristics. For example, 
we have just seen that the substitution effect of a rise in the price of X on 
the quantity of X purchased will always be negative (the Slutsky theorem).
But these authors noted that there is another portion of the overall effect
(the income-effect portion) whose behavior is unpredictable in general— 
and it must therefore be stripped away so that the systematic and pre- 
dictable behavior of the substitution effect can be revealed. This discovery 
can be likened to a filter which eliminates the static from the transmission 
of sound so that the underlying message can be made out. 

This, then, is the unexalted role of the income effect—it is discussed 
primarily in order to permit us to remove it. One should not be misled 
by the subclassification of possible values for the income effect—the 
statement that for “normal” goods it has one sign and for “inferior” 
goods it has another. This is only a little more than the use of nomenclature 
to put a better appearance on our ignorance. What this last subclassifica- 
tion asserts, in essence, is that the income effect can go either way and 
that we can think of realistic examples of both cases. Hence, we can make 
firm predictions about demand reactions to price changes only if this 
undependable portion of the price effect has been removed. 

Much of more sophisticated consumer theory proceeds accordingly, 
discussing matters in terms of “net” concepts—after removal of the 
income effects—rather than the corresponding gross concepts which 
correspond more closely to the observable data but whose behavior patterns 
vary in a manner that defies generalization. 


14. Complements and Substitutes 


The distinction between substitute and complementary commodities 
is easily grasped intuitively. Vodka and gin are substitute commodities— 
they serve the same general purposes, and if we have more of one, we will 
tend to want less of the other. On the other hand, bread and butter or gin 
and vermouth are complements—for many consumers they are better 
together and hence an increase in the availability of one tends to stimulate 
the demand for the other. But how does one measure these relationships? 
One straightforward approach makes use of the cross elasticity of demand— 
the effect of a change in the price of X1 on the demand for X2. If goods are
substitutes, we expect the cross elasticity to be positive, and we expect the
reverse if they are complements, for if X1 and X2 are substitutes, a rise
in p_1 will decrease x_1 (the quantity of X1 demanded), and as a result the
consumer will seek more of the substitute. Consequently, the rise in p_1
will lead to an increase in x_2, and so the cross elasticity, (dx_2/dp_1)(p_1/x_2),
will be positive. The reverse will be true of complements, for the rise in
p_1 will decrease x_1 and hence decrease x_2.

Or will it? The answer is that it always will16 unless the ambiguous


16 For proof of this statement see Chapter 14, Section 9, Proposition 10.

income effect messes matters up once again. For example, suppose the
consumer is a relatively impecunious martini drinker. Then a rise in the 
price of gin may lead him to increase his purchases of cheap vermouth, 
which he will use instead of the imported variety. It is clearly the income 
effect which stimulates his consumption of the inferior good, cheap ver- 
mouth, as the rise in price of gin reduces the consumer’s real income. Thus, 
though the goods are complements, their cross elasticity will in this case 
be positive. Similarly, a rise in the price of hamburger may lead a poor 
family to decrease its demand for steaks even though they are substitutes. 

There is worse to come: Because of the asymmetry of the income 
effect the cross elasticity of demand for good 1 with respect to the price 
of good 2 may be positive and yet the elasticity of demand for good 2 with 
respect to the price of good 1 may be negative, for one good may play a 
small part in the consumer’s budget and so a rise in its price will have a 
negligible income effect while the reverse may be true of the other good. 

To make it easier to describe these cases economists use the following 
classifications: 


1. Good 1 is a gross substitute for 2 if its cross elasticity of demand
with respect to p_2 is positive,

2. It is a gross complement if that cross elasticity is negative,

3. It is a net substitute if the cross elasticity is positive after the
income effect is removed,

4. It is a net complement if the cross elasticity is negative after
removal of the income effect.


15. Compensated Demand Curves 


The elimination of the income effect has also been carried out for 
demand curves, and so much of recent analysis has been carried out in 
terms of a compensated demand curve, i.e., the demand curve after adjust- 
ment to remove income effects. This curve describes the result of the 
conceptual experiment described in the following steps: 


a. Start from some initial price-quantity combination. 

b. Consider some alternative price, e.g., a price higher than the 
initial one. 

c. Adjust the consumer’s income so as to leave him with the real 
purchasing power he possessed initially; e.g., if price is increased, 
he must be compensated by an increase in income sufficient to permit 
him to purchase the initial quantity combination, should he choose to 
do so. 

d. Now examine the effect of the price change on his purchases 
after the compensation for income effect (step c.).

Graphically the process looks as shown in Figure 12. Here DD’ is
the ordinary (uncompensated) demand curve. Suppose the initial price-
quantity combination is p_a, x_a and we consider the consumer's behavior at
the alternative price, p_b. Without compensation his purchases would be
reduced to x_b. But to compensate him for the erosion of his purchasing
power stemming from the price rise, the consumer is provided a (con-
ceptual) infusion of income. If X is not an inferior good, this means his
purchases will not fall quite as much as if he had received no compensation,
i.e., instead of going from x_a to x_b they will decrease only to x_c. This means
that AC is a compensated demand curve through point A. Thus, the com-
pensated demand curve for a rise in price will generally lie to the right of the
ordinary demand curve (except at the initial point), provided the com- 
modity in question is not an inferior good. 


Figure 12


We note at once that there is not just one single compensated demand 
curve—indeed, there will be a different one for every initial price-quantity 
combination, i.e., for each initial point (like A) on the ordinary demand 
curve DD'. Moreover, there are also compensated demand curves for 
price decreases, and these (like curve BE) will usually lie to the left of the 
ordinary demand curve. For if price falls, in the absence of compensation 
the consumer's real income will rise. Thus, after enough income has been 
taken away to offset this gain, if the good is not inferior he will buy less 
than he would have otherwise. For example, with the lower price, p_a,
substituted for p_b the uncompensated quantity demanded will rise to
x_a, but with the (negative) compensation to offset the rise in real income
accompanying the price fall, quantity demanded will rise only to x_e.
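
The conceptual experiment of steps a through d can be sketched numerically. The demand function x = alpha * m / p (which follows from an assumed utility function of the form u = x^alpha * y^(1 - alpha)), the income and price figures, and the function names are all illustrative assumptions. Compensation here follows step c: income is adjusted by just enough to let the consumer buy his initial quantity at the new price.

```python
def ordinary_demand(m, p, alpha=0.5):
    """Uncompensated demand for X under the assumed demand rule x = alpha*m/p."""
    return alpha * m / p


def compensated_demand(m, p_initial, p_new, alpha=0.5):
    """Demand at p_new after income is adjusted so the initial purchase stays affordable."""
    x_initial = ordinary_demand(m, p_initial, alpha)
    m_adjusted = m + (p_new - p_initial) * x_initial   # step c
    return ordinary_demand(m_adjusted, p_new, alpha)   # step d


m = 100.0
# Price rise from $10 to $20, starting at the point where x = 5.
print(ordinary_demand(m, 20.0), compensated_demand(m, 10.0, 20.0))   # 2.5 3.75
# Price fall from $20 to $10, starting at the point where x = 2.5.
print(ordinary_demand(m, 10.0), compensated_demand(m, 20.0, 10.0))   # 5.0 3.75
```

For the price rise the compensated quantity lies to the right of the ordinary one, and for the price fall it lies to the left, as the text describes for a good that is not inferior.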


16. Ordinal Utility Functions: Monotonic Transformations 


Even in an ordinalist analysis it is often convenient to conduct the 
calculations in terms of utility functions rather than an indifference map.
By approaching the consumer's decision as a matter of utility maximiza-
tion subject to a budget constraint one is enabled to use all of the math-
ematical apparatus of constrained maximization, the powerful Lagrangian
techniques, their extension in Kuhn-Tucker methods, etc. 

But what utility function can an ordinalist use, since he does not believe 
that absolute psychic utility can be measured? The answer is that for any 
given indifference map there is ordinarily17 an infinity of utility functions
any one of which will do just as well as any other. Given two indifference 
curves A and B with the latter preferable to the former, we can say 
arbitrarily that any output combination on the former offers 7 utils and 
the latter 11 utils or instead we can say that A provides 3 utils and B 
provides 59 utils. As long as points on the preferred indifference curve 
are assigned the higher utility numbers they provide all the information 
on preferences that the ordinalist needs for his calculations. We know 
that above any indifference curve the utility surface is level and that the 
surface gets higher as we move to higher indifference curves, but no more. 
The actual height of the utility surface above any indifference curve is 
left completely unspecified, so that any of an infinite number of utility 
surfaces is usually consistent with any given indifference map—that is, 
all of the surfaces in the set will give us the same indifference map. Thus, 
to an ordinalist the surface in Figure 4b represents only one of the infinite 
number of utility surfaces consistent with the given indifference map. 

The switch from one such acceptable utility function to another is 
said to involve a monotonic transformation. A transformation may be 
described as the replacement of one set of numbers by another. A trans- 
formation is monotonic if a higher number in the first set is always replaced 
by a higher number in the second set. Table 1 illustrates the relationship.


TABLE 1

Goods collection                 a1    a2    a3    a4    a5

First set of utility numbers      3     7     9    11    15
Second set of utility numbers     1   2.6    55    59    60
Third set of utility numbers      3     7    10     8    15


We see that the replacement of the first set of numbers by the second is, 
indeed, a monotonic transformation since whenever the numbers in the 
first row increase, the numbers in the second also increase. On the other 


17 There are some cases that may be considered pathological in which a peculiar 
indifference map precludes the existence of any utility surface consistent with it. This is 
the so-called integrability problem—an indifference map which permits no utility 
function is called nonintegrable.

hand, the replacement of the first set of numbers by the third is not
monotonic, for the 9-util entry in row 1 is replaced by (transformed into) 
a 10-util entry in row 3, while the 11-util entry is transformed into an 8-util 
figure. Thus, the figures in row 1 assert that goods combination a4 is 
preferable to a3, whereas the utility figures in row 3 indicate that the 
preferences are reversed. 
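
The test applied to Table 1 can be written as a short sketch. The function name is_monotonic is an illustrative assumption; the three rows of numbers are those of the table.

```python
def is_monotonic(first, second):
    """True if every higher number in `first` is replaced by a higher number in `second`."""
    n = len(first)
    return all(
        second[i] > second[j]
        for i in range(n)
        for j in range(n)
        if first[i] > first[j]
    )


first_set = [3, 7, 9, 11, 15]
second_set = [1, 2.6, 55, 59, 60]
third_set = [3, 7, 10, 8, 15]

print(is_monotonic(first_set, second_set))   # True
print(is_monotonic(first_set, third_set))    # False
```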

If we use u = f(x_1, ..., x_n) to describe any one of the infinity of utility
functions for a given indifference map, then any monotonic transformation 
of this function, i.e., any other acceptable utility function, is written 


u* = g(u),   dg/du > 0,


where g is any function of u whose value increases whenever u increases. 
But that is just what is meant by monotonicity, and it is also precisely 
what is meant by the condition dg/du > 0. It is important to note that any
monotonic transformation of a utility surface will leave the indifference curves
unchanged.18


We will see next that the class of utility functions acceptable to an
ordinalist is characterized by a property called quasi-concavity, which is
extremely useful analytically. But in order to define the concept we must
first deal with a preliminary matter.


17. Interior Points on a Line Segment19


18 It is easy to prove that the indifference [...] between u and u* as a utility function. The [...]

$$\frac{\partial u^*/\partial x_1}{\partial u^*/\partial x_2} = \frac{(dg/du)\,\partial u/\partial x_1}{(dg/du)\,\partial u/\partial x_2} = \frac{\partial u/\partial x_1}{\partial u/\partial x_2}.$$

Therefore, both utility functions necessarily yield the same number for the slope of the
indifference curve through any point in the indifference map.
19 The remainder of this chapter is made up of relatively advanced material.
will be a weighted sum of the corresponding coordinates of A and B, with
a fixed weight, k, applied to each and every coordinate of A, and the weight 
1 — k assigned to each and every coordinate of B, and where k is some 
number between zero and unity. More specifically, we have (in the two- 
dimensional case) the rather tedious but important 


[Figure 13]

Proposition 3: Let point C with coordinates $(y_c, x_c)$ be any point on
line segment AB. Write $x_c$ as a weighted average of $x_a$ and $x_b$ so that
$x_c = kx_a + (1 - k)x_b$, where $0 < k < 1$. ($x_c$ is then called a convex
combination of $x_a$ and $x_b$.) Then, the y coordinates of points C, A, and B
will satisfy the corresponding equation with the same value of k, i.e., we
will have20 $y_c = ky_a + (1 - k)y_b$. Moreover, $k/(1 - k)$ equals the ratio
$(x_b - x_c)/(x_c - x_a)$.


It should be noted that the same result holds in n-dimensional space:
Let A and B be two points with respective coordinates $(x_{1a}, \dots, x_{na})$ and


20 Proof: The triangles AEC and CFB are similar. Hence

(i) $(x_b - x_c)/(x_c - x_a) = FB/EC = FC/EA = (y_b - y_c)/(y_c - y_a)$.

But substituting for $x_c$ the expression $kx_a + (1 - k)x_b$, as given in the text, the first of
the preceding fractions becomes

$$\frac{x_b - x_c}{x_c - x_a} = \frac{x_b - kx_a - (1 - k)x_b}{kx_a + (1 - k)x_b - x_a} = \frac{k(x_b - x_a)}{(1 - k)(x_b - x_a)} = \frac{k}{1 - k}.$$

Hence by (i) we must also have
$(y_b - y_c)/(y_c - y_a) = k/(1 - k)$ or $(y_b - y_c) - (ky_b - ky_c) = ky_c - ky_a$,
which gives us our result: $y_c = ky_a + (1 - k)y_b$.

$(x_{1b}, \dots, x_{nb})$ and if C is any point in the interior of the line segment
connecting A and B, then there exists a number k such that $0 < k < 1$
and such that if $x_{ic}$ is the ith coordinate of C, then $x_{ic} = kx_{ia} + (1 - k)x_{ib}$
for each and every i.
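
A small numerical illustration of Proposition 3 may help. This is a sketch under stated assumptions: the helper `convex_combination` and the sample points are invented for illustration, and coordinates are written in (x, y) order rather than the (y, x) order used in the text.

```python
def convex_combination(point_a, point_b, k):
    """Return k*A + (1 - k)*B, coordinate by coordinate (hypothetical helper)."""
    if not 0 <= k <= 1:
        raise ValueError("k must lie between zero and unity")
    return tuple(k * a + (1 - k) * b for a, b in zip(point_a, point_b))


# Assumed sample points, written as (x, y).
A = (1, 2)
B = (5, 10)
k = 0.25

C = convex_combination(A, B, k)

# The same weight k applies to every coordinate, and k/(1 - k)
# equals (x_b - x_c)/(x_c - x_a), as the proposition asserts.
ratio = (B[0] - C[0]) / (C[0] - A[0])
print(C, ratio, k / (1 - k))
```

Because the function loops over however many coordinates the points have, the same sketch covers the n-dimensional statement as well.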


18. Concave and Strictly Concave Functions 


Section 5 of Chapter 7 offered intuitive definitions of concave and
convex functions.21 Using the customary frame of reference (the shape
of the surface as viewed from the floor of the diagram) we can envision a
concave function as one having the general shape of an inverted bowl,
while a convex function is shaped like an upright bowl. When a function
is concave, if we take any two points on the surface of the bowl and connect
them by a line segment, it is clear intuitively that every interior point on
that connecting line segment will lie beneath the surface of the bowl.
This characteristic is used by mathematicians to define a concave function.
Specifically, the theorem on the formula for interior points of a connecting
line segment (Proposition 3 of the preceding section) is used to
define concavity. The general notion is illustrated in Figure 13b in which
SS' is the graph of a concave function $y = f(x)$. A and B are any two
points on the surface and C is any point on the connecting line segment,
and since the curve is concave, C lies below point D on SS' directly above
$x_c$, the x coordinate of C. Now we have


(i) the y coordinate of point D, $y_d = f(x_c) = f[kx_a + (1 - k)x_b]$
by the formula for $x_c$ from the preceding section;

(ii) by Proposition 3 the y coordinate of point C $= y_c = ky_a + (1 - k)y_b = kf(x_a) + (1 - k)f(x_b)$.


Therefore, we have


Definition: The function $y = f(x)$ is strictly concave if for any two
values of x, call them $x_a$ and $x_b$, every interior point $C = (y_c, x_c)$ on the line
connecting $(y_a, x_a)$ and $(y_b, x_b)$ lies below the corresponding point
$D = (y_d, x_c)$ on the graph of the function, i.e., [by (i) and (ii)] if


$$y_c = kf(x_a) + (1 - k)f(x_b) < f[kx_a + (1 - k)x_b] = y_d.$$


Definition: The function $y = f(x)$ is concave if for any two values of x,
call them $x_a$ and $x_b$, every point $C = (y_c, x_c)$ on the line connecting points


21 The reader may well want to review the discussion and the distinction between the
concept of a convex set (region) and that of a convex function.


$A = (y_a, x_a)$ and $B = (y_b, x_b)$ lies on or below the corresponding point
$D = (y_d, x_c)$ on the graph of the function, that is, if the < in the previous
definition is replaced by ≤.


These concepts are readily extended to the (n + 1)-variable case
$y = f(x_1, \dots, x_n)$, where we merely write $y_a = f(x_{1a}, \dots, x_{na})$, etc., and
leave all other elements of the definitions of concavity and strict concavity
completely unchanged.
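
The two definitions can be tried out numerically. The following is a rough sketch, not a proof: it only samples a finite set of points and weights, and the names `is_concave_on_samples`, `hill`, `line`, and `bowl` are hypothetical choices made for illustration.

```python
def is_concave_on_samples(f, points, weights=(0.25, 0.5, 0.75),
                          strict=False, tol=1e-12):
    """Test the chord condition k*f(x_a) + (1-k)*f(x_b) <= f(k*x_a + (1-k)*x_b)
    on a finite sample (a numerical check only, not a proof)."""
    for x_a in points:
        for x_b in points:
            if x_a == x_b:
                continue
            for k in weights:
                y_c = k * f(x_a) + (1 - k) * f(x_b)  # height of C on the chord
                y_d = f(k * x_a + (1 - k) * x_b)     # height of D on the graph
                if strict:
                    if not y_c < y_d - tol:
                        return False
                elif not y_c <= y_d + tol:
                    return False
    return True


def hill(x):
    return -(x - 2) ** 2   # inverted bowl


def line(x):
    return 2 * x + 1       # chord coincides with the graph


def bowl(x):
    return x ** 2          # upright bowl


sample_points = list(range(0, 11))

hill_strict = is_concave_on_samples(hill, sample_points, strict=True)
line_weak = is_concave_on_samples(line, sample_points)
line_strict = is_concave_on_samples(line, sample_points, strict=True)
bowl_weak = is_concave_on_samples(bowl, sample_points)
print(hill_strict, line_weak, line_strict, bowl_weak)
```

The straight line shows the difference between the two definitions: it satisfies the weak inequality everywhere but the strict one nowhere.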

Now strict concavity is the natural extension of the second-order 
maximum condition requiring a negative second derivative of the function 
being maximized. In effect, strict concavity requires that if the function is 
differentiable, its second derivatives be negative along any cross section,
i.e., in any direction in the (n + 1)-dimensional (!) graph representing
the function. Looked at another way, obviously in seeking to maximize
we want the relevant function to be shaped like a hill or an inverted cup, 
and that is just what we mean by (strict) concavity. 

Thus, if we are to analyze consumer behavior in terms of utility 
maximization, it would be convenient for the utility function to be concave, 
with the second-order conditions for maximization thereby satisfied. 
Unfortunately, ordinal utility analysis cannot accept such an assumption. 
For, as we have seen, given any indifference map, there is an infinity of 
utility functions corresponding to it, any one of which is as acceptable as 
any other. Furthermore, as we will confirm next, among those that are 
acceptable there will be some utility functions that are concave and some 
that are not. This is shown clearly by Figures 14a and 14b, both of which 
have the same indifference curves for combinations of $x_1$ and $x_2$, yet the
first of which has a utility surface that is concave and the latter of which
does not. The one utility surface is clearly a monotonic transformation of
the other, and hence neither is more valid than the other as a representation 
of the indifference map. Thus, we simply cannot take an ordinalist point 
of view and yet require a utility function to be concave. Concavity of the 
utility function essentially has no implications about preferences in an 
ordinalist world because it is irrelevant for the shapes of the indifference 
curves. It is therefore just undefinable in any operational or observable 
terms. Some substitute concept has to be found by the ordinalist to serve 
the purposes of the second-order conditions. This substitute is a weaker 
condition, quasi-concavity, to which we now turn. 


19. Quasi-Concave Utility Functions 


Intuitively, a function is taken to be strictly quasi-concave if it follows 
one of two behavior patterns: (a) It is monotonic throughout; that is, in 
any direction in which its graph slopes uphill it does so “forever,” i.e., for 
all values of its variables, and in any direction in which it slopes downhill, 
it also never reverses direction; (b) alternatively, if the function is not 
monotonic, it will have one single maximum with no other bumps or dents. 
Where the second alternative holds we see that the quasi-concave function
does, indeed, resemble a concave relationship, as is illustrated in Figure
14a. But where the first alternative applies, we will see (Figure 14b)
that the shape of a quasi-concave surface may depart significantly from
that of one that is concave. The formal definition of quasi-concavity bears
some resemblance to that of concavity: 


Definition: A function $y = f(x_1, \dots, x_n)$ is quasi-concave if, given
any two sets of values22 of the x's, $x_a = (x_{1a}, \dots, x_{na})$ and
$x_b = (x_{1b}, \dots, x_{nb})$, where, say,


$$f(x_a) \le f(x_b),$$


then


$$f(x_c) \ge f(x_a)$$


for $x_c$ any point on the line segment connecting23 $x_a$ and $x_b$.


That is, the function is quasi-concave if given any two points A and B on
its surface, then the height of any intermediate point, C, on a cross section
through A and B is at least as great as the lower of the two points A and B.
Similarly, we have 


22 Note that here we introduce the vector notation $x = (x_1, \dots, x_n)$ so that $f(x)$
represents $f(x_1, \dots, x_n)$.
23 [so that there exists a k value $0 < k < 1$ such that for any $i = 1, \dots, n$,
$x_{ic} = kx_{ia} + (1 - k)x_{ib}$]

Definition: A function $f(x)$ is strictly quasi-concave if for any two values
$x_a$ and $x_b$ and any point $x_c$ in the interior of the line segment connecting them, $f(x_c)$ is
greater than at least one of $f(x_a)$ and $f(x_b)$, i.e., if either


$$f(x_c) > f(x_a) \quad \text{or} \quad f(x_c) > f(x_b).$$


It is easy to prove


Proposition 4: Every function which satisfies the definition of concavity 
automatically satisfies that of quasi-concavity, and, similarly, every 
strictly concave function is automatically strictly quasi-concave. 

Though we omit a formal proof, the reason for this result is not difficult
to see. By definition, interpreting $x_a$, $x_b$, and $x_c$ as before, a concave function
is one for which $f(x_c)$ is at least as great as a weighted average of $f(x_a)$
and $f(x_b)$, i.e., for which $f(x_c) \ge kf(x_a) + (1 - k)f(x_b)$. But then $f(x_c)$ must
obviously be at least as great as the smaller of the two items in the average,
which is what quasi-concavity requires.

But while every concave function is therefore automatically quasi- 
concave, the converse is not true. 


Proposition 5: A function which is quasi-concave need not be concave. 

That is just what we mean by saying that quasi-concavity is a weaker
condition than concavity. There are many functions that are quasi-concave
but not concave. An example is all that is needed to prove the
proposition. The function $y = x^2$ is clearly not concave, for its graph
"goes" upward toward the right at an increasing rate somewhat like the
surface in Figure 14b, and so the line segment connecting any two points
on its graph lies above its graph, not below it as concavity requires. However,
$y = x^2$ is quasi-concave over nonnegative values of x (the range relevant
to quantities of goods), for take any two such values, $x_a < x_b$. For
any $x_c$ between them we may write $x_c = x_a + \delta$, $\delta > 0$. Then $y(x_c) =
(x_a + \delta)^2 > x_a^2 = y(x_a)$, as is required for strict quasi-concavity.

Thus we have shown that $y = x^2$ is a function that is (strictly) quasi-concave
but not concave (or strictly concave).
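
The example can be checked on a sample of points. This is a minimal sketch, assuming a finite grid and a hypothetical helper `is_quasi_concave_on_samples`; it also shows that the argument relies on x being nonnegative, since on a range that straddles zero the square function dips below both end points.

```python
def is_quasi_concave_on_samples(f, points, weights=(0.25, 0.5, 0.75), strict=False):
    """Check f(x_c) >= min(f(x_a), f(x_b)) for sampled interior points x_c
    (with > in the strict case). A numerical check only, not a proof."""
    for x_a in points:
        for x_b in points:
            if x_a == x_b:
                continue
            lower = min(f(x_a), f(x_b))
            for k in weights:
                x_c = k * x_a + (1 - k) * x_b
                if strict:
                    if not f(x_c) > lower:
                        return False
                elif not f(x_c) >= lower:
                    return False
    return True


def square(x):
    return x ** 2


nonnegative = list(range(0, 11))   # assumed sample of quantities
mixed_sign = list(range(-5, 6))    # sample that straddles zero

quasi_on_nonnegative = is_quasi_concave_on_samples(square, nonnegative, strict=True)
quasi_on_mixed_sign = is_quasi_concave_on_samples(square, mixed_sign)

# Concavity fails: the chord midpoint lies above the graph.
chord_midpoint = 0.5 * square(0) + 0.5 * square(10)
graph_midpoint = square(5)
print(quasi_on_nonnegative, quasi_on_mixed_sign, chord_midpoint, graph_midpoint)
```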

Next we show 


Proposition 6: A quasi-concave function cannot have two (local) 
maxima. 


Proof by contradiction: Suppose the contrary, that $y = f(x)$ is quasi-concave
and yet possesses two (separated) local maxima $x_a$ and $x_b$. By
definition every local maximum point is surrounded by points of lower
altitude; therefore there must be a point $x_c$ on the line segment joining
$x_a$ and $x_b$ such that $f(x_c) < f(x_a)$ and $f(x_c) < f(x_b)$. But this contradicts
the premise that $f(x)$ is quasi-concave.

This means that if a quasi-concave function has more than one maximum
point they must be contiguous with the top of the graph, forming a level
plateau.24 This completes our characterization of the quasi-concave
functions themselves. They may have a maximum point, but for a function
that is strictly quasi-concave, never more than one, and if they have none,
they will be monotonic. They need not be concave and include utility
surfaces like that in Figure 14b as well as that in Figure 14a.

Next, we prove a proposition which shows that the quasi-concavity 
of utility functions is compatible with an ordinalist analysis, which, it 
will be recalled, treats two utility functions to be interchangeable if one 
is a monotonic transform of the other. 


Proposition 7: Any function $y^* = g(y) = g[f(x)]$ obtained by a monotonic
transformation from a quasi-concave (strictly quasi-concave)
function $y = f(x)$ must itself also be quasi-concave (strictly quasi-concave).


Proof: By definition of quasi-concavity with $x_c$, $x_a$, and $x_b$ defined as
before, $y_c = f(x_c) \ge y_a = f(x_a)$. Then by the monotonicity of the transformation
$y_c^* = g(y_c) \ge g(y_a) = y_a^*$, which proves the quasi-concavity of
$y^*$, for it shows that $y_c^*$ must also equal or exceed the smaller of $y_a^*$ and $y_b^*$.

Thus, quasi-concavity is not a characteristic which evaporates as one 
utility function is replaced by another obtained from the former by a 
monotone transformation. 
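
Proposition 7, together with the earlier remark that concavity does not survive a monotonic transformation, can be illustrated numerically. This is a sketch under stated assumptions: the utility function, the transformation $g(u) = u^4$, and the sample grid are all chosen here for illustration and do not come from the text, and checking a finite grid is not a proof.

```python
import math
from itertools import product

WEIGHTS = (0.25, 0.5, 0.75)


def segment_point(a, b, k):
    """Point k*a + (1 - k)*b on the segment joining a and b."""
    return tuple(k * ai + (1 - k) * bi for ai, bi in zip(a, b))


def is_quasi_concave(f, points, tol=1e-9):
    """Sampled check of f(x_c) >= min(f(x_a), f(x_b))."""
    for a, b in product(points, repeat=2):
        for k in WEIGHTS:
            if f(segment_point(a, b, k)) < min(f(a), f(b)) - tol:
                return False
    return True


def is_concave(f, points, tol=1e-9):
    """Sampled check of f(x_c) >= k*f(x_a) + (1 - k)*f(x_b)."""
    for a, b in product(points, repeat=2):
        for k in WEIGHTS:
            if f(segment_point(a, b, k)) < k * f(a) + (1 - k) * f(b) - tol:
                return False
    return True


def utility(x):
    # Assumed example: u = sqrt(x1 * x2), a concave utility surface.
    return math.sqrt(x[0] * x[1])


def transformed_utility(x):
    # Monotonic transformation g(u) = u**4, increasing for u > 0.
    return utility(x) ** 4


grid = list(product(range(1, 5), repeat=2))  # assumed sample of goods bundles

results = {
    "u quasi-concave": is_quasi_concave(utility, grid),
    "u concave": is_concave(utility, grid),
    "u* quasi-concave": is_quasi_concave(transformed_utility, grid),
    "u* concave": is_concave(transformed_utility, grid),
}
print(results)
```

Both functions share the same indifference curves, so an ordinalist cannot distinguish them; only the quasi-concavity check gives the same answer for both.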


Finally, we come to another proposition that reveals the reason for 
adoption of the premise of strict quasi-concavity for utility functions, 
for though this premise is weaker than that of strict concavity, it is never- 
theless sufficient, if used along with the premise of nonsatiety, to guarantee 
that indifference curves are convex to the origin. And once that property 
of indifference curves is satisfied all of the usual analysis of consumer 
behavior proceeds without difficulty.25 Thus we conclude our discussion
with 


24 But if the function is strictly quasi-concave, even two such points with equal values
of $f(x)$ are impossible. Indeed, we have the following more general proposition: A function
that is strictly quasi-concave cannot have two global maxima $x_a$ and $x_b$.

Proof: If both points are global maxima, we must have $f(x_a) = f(x_b)$. Hence, if $x_c$ is
any point on the line segment joining $x_a$ and $x_b$, we must have by strict quasi-concavity
$f(x_c) > f(x_a) = f(x_b)$. But this obviously contradicts the assertion that at points $x_a$ and
$x_b$ the function $f(x)$ attains a global maximum.


25 We also need for this purpose the negative slope of the indifference curves and the 
property that curves farther from the origin are always preferred, but these, as we have 
seen, follow from the nonsatiety assumption.

Proposition 8: If the consumer is not sated in any commodity (he
prefers more of any good or any combination of goods, holding all other 
quantities constant) and his utility function is strictly quasi-concave, 
then any of his indifference curves (surfaces) will be convex to the origin. 


Geometric demonstration: Given a strictly quasi-concave utility function
(surface OSTU in Figure 15a), select any two points A and B of equal 


Figure 15 


height (utility) on its surface so tha 


floor of the diagram (the $x_1$, $x_2$ plane) lie on the same indifference curve.
Now connect a and b by a li 


OR through the origin. To 
ference curve which connects indifferent points a and b we ask where the 


origin of the indifference curve through points a, e, and b

20. Elementary Mathematics of Demand Analysis


As we have noted, much of standard demand analysis is based on the 
formulation which takes the consumer to maximize a utility function 


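$$u = f(x_1, \dots, x_n)$$

subject to a budget constraint

$$p_1x_1 + p_2x_2 + \cdots + p_nx_n = m,$$

where $x_i$ is the quantity of commodity i purchased by the consumer, $p_i$
is its price, and m is the total amount of money available to him. Using
a Lagrangian approach to the problem (cf. Chapter 4, Section 8) we
obtain the expression

$$u_\lambda = f(x_1, \dots, x_n) + \lambda(m - p_1x_1 - p_2x_2 - \cdots - p_nx_n),$$

which we maximize by setting each of its partial derivatives equal to zero:

$$\frac{\partial u_\lambda}{\partial x_1} = \frac{\partial f}{\partial x_1} - \lambda p_1 = 0$$

$$\cdots$$

$$\frac{\partial u_\lambda}{\partial x_n} = \frac{\partial f}{\partial x_n} - \lambda p_n = 0$$

$$\frac{\partial u_\lambda}{\partial \lambda} = m - p_1x_1 - \cdots - p_nx_n = 0.$$

Here, by definition, $\partial f/\partial x_i$ is the marginal utility of i. Dividing the preceding
equation corresponding to commodity i by the equation which refers to
commodity j, we have

$$\frac{\partial f/\partial x_i}{\partial f/\partial x_j} = \frac{p_i}{p_j}. \tag{5}$$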

This is the equilibrium condition which was derived in a less formal manner 
earlier in the chapter. It states that in equilibrium the ratio of the prices of 
the two commodities must be equal to the ratio of their marginal utilities, 
i.e., to their marginal rate of substitution. 

Let us see finally how the utility function can be used to derive a 
specific demand relationship.

Example: Suppose a consumer has $90 available to be divided between commodities
A and B, and suppose the unit price of B is fixed at 20 cents. What will
be his demand equation for A if his utility function is u = log x_a + 2 log x_b?


Answer: We are given $p_b = 0.2$ and $m = 90$. By direct differentiation
of the utility function we obtain $\partial u/\partial x_a = 1/x_a$ and $\partial u/\partial x_b = 2/x_b$.
Substituting these into equilibrium condition (5) we have

$$\frac{1/x_a}{2/x_b} = \frac{p_a}{0.2}$$

or

$$x_b = 10p_ax_a.$$

Substitution of this value of $x_b$, $p_b = 0.2$ and $m = 90$ into the budget
$m = p_ax_a + p_bx_b$ yields

$$90 = p_ax_a + 0.2(10p_ax_a) = p_ax_a + 2p_ax_a$$

or

$$p_ax_a = 30.$$

This is our desired demand equation, which, incidentally, happens to be a
rectangular hyperbola.


To summarize, given a utility function, income, and the prices of all
other commodities, we obtain the demand for the remaining commodity
by direct substitution into the equilibrium condition (5) and the budget
constraint.
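
The worked example can be cross-checked by brute force. The sketch below is only an illustration and not the method used in the text: it maximizes u = log x_a + 2 log x_b along the budget line with a simple ternary search (which works here because utility along the budget line has a single peak) and compares the result with the rule that expenditure on A equals 30. The function name `demand_for_a` and the search settings are assumptions.

```python
import math


def demand_for_a(p_a, p_b=0.2, m=90.0, iterations=200):
    """Quantity of A that maximizes log(x_a) + 2*log(x_b) on the budget line
    p_a*x_a + p_b*x_b = m, found by ternary search (illustrative sketch)."""

    def utility_on_budget(x_a):
        x_b = (m - p_a * x_a) / p_b  # spend the rest of income on B
        return math.log(x_a) + 2 * math.log(x_b)

    # Keep strictly inside the budget line so both logarithms are defined.
    low, high = 1e-6, m / p_a - 1e-6
    for _ in range(iterations):
        left = low + (high - low) / 3
        right = high - (high - low) / 3
        if utility_on_budget(left) < utility_on_budget(right):
            low = left
        else:
            high = right
    return (low + high) / 2


for price in (0.5, 1.0, 3.0, 10.0):
    quantity = demand_for_a(price)
    # Expenditure on A should be close to 30 whatever the price.
    print(price, round(quantity, 4), round(price * quantity, 4))
```

Whatever price of A is tried, the product of price and quantity stays near 30, which is the rectangular hyperbola obtained analytically.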


PROBLEMS 


Find the demand function for commodity A, given 
1. p_b = $3, m = $20, u = 4 x_a x_b.
2. ps = $12, m = $246, u = el aTa. 


3- pe = $8, m = $100, u = 27, — 322+ zp- 452. E 782. 


REFERENCES 


[...] George J. Stigler and Kenneth E. Boulding (eds.), Readings in Price
Theory, Richard D. Irwin, Inc., Homewood, Ill., 1959.

Henderson, James M., and Richard E. Quandt, Microeconomic Theory, 3rd
edition, McGraw-Hill Book Company, New York, 1971, Chapter 2,
Sections 1-7.

226 Demand Curves, Utility Surfaces, and Indifference Maps Chapter 9 


Hicks, J. R., Value and Capital, 2nd edition, Oxford University Press, New York, 
1946, Part I and Mathematical Appendix to these chapters. (Rather difficult 
reading.)

Hicks, J. R., A Revision of Demand Theory, Oxford University Press, New York, 1956.

Malinvaud, Edmond, Lectures on Microeconomic Theory, North-Holland Publish- 
ing Company, Amsterdam, 1972, Chapter 2 (rather difficult). 

Marshall, Alfred, Principles of Economics, 8th edition, Macmillan & Co., Ltd., 
London, 1922, Book III and pp. 838-840. 

Morgenstern, Oskar, “Demand Theory Reconsidered,” Quarterly Journal of 
Economics, Vol. LXII, February 1948.

Samuelson, Paul A., Foundations of Economic Analysis, Harvard University 
Press, Cambridge, Mass., 1947, Chapters V and VI. (Highly mathematical.) 

Walsh, V. C., Introduction to Contemporary Microeconomics, McGraw-Hill Book 
Company, New York, 1970, Chapters 1 and 3 (axiomatic treatment of
utility theory). 


Chapter 10

On Empirical Determination of Demand Relationships


1. Why Demand Functions? 


Demand functions, as they are defined in economic analysis, are rather 
queer creatures, somewhat abstract, containing generous elements of the 
hypothetical and, in general, marked by an aura of unreality. The peculiarity
of the concept is well illustrated by the fact that only one point on a
demand curve can ever be observed directly with any degree of confidence,
because by the time we can obtain the data with which to plot a second
point, the entire curve may well have shifted without our knowing it. A
more fundamental but related source of our discomfort with the idea is
the fact that the demand relationship is defined as the answer to the set 
of hypothetical questions which begin, “What would consumers do if 
price (or advertising outlay, or some other type of marketing effort) were 
different than it is in fact?” We are, then, dealing with information about 
potential consumer behavior in situations which consumers may never 
have experienced. And, since we have very little confidence in the con- 
stancy of consumer tastes and desires, all of these data are taken to refer 
to possible events at just one moment of time—e.g., consumer reactions 
to alternative possible prices if any of them were to occur tomorrow at 
2:47 P.M. 

In view of all this, there should be little wonder that people with an
orientation toward applied economics occasionally become somewhat impatient
with the economic theorist's demand function. Yet no matter how
ingenious the circumlocutions which may have been employed, they have
been unable to find an acceptable substitute for the concept. For the 
demand function must ultimately play a critical role in any probing 
marketing decision process, and there is really no way to get away from it. 

For example, to decide on the number of salesmen which will best 
serve the interests of the firm, it is first necessary to know what difference 
in consumer purchases would result from alternative sales force sizes. But 
this is precisely the sort of odd and hypothetical information which goes 
to make up the demand relationship. It is for exactly the same reason that 
many large and reputable firms in diverse fields of industry are conducting 
ambitious research programs whose aim is the determination of their 
advertising-demand curves, that is, the relationship between their ad- 
vertising outlays and their sales. So far, these efforts have met with varying 
degrees of success, and it must be admitted that many of them have not 
come up with very meaningful results. For the empirical determination of 
demand relationships is no simple matter and there are many booby traps 
for the amateur investigator and the unwary. It is no trick at all, on looking
over a small sample of the published demand studies, to come up with 
horrible examples of just about every available type of misstep. 

This chapter is designed primarily to point out some of the pitfalls 
which threaten the investigator of demand relationships. Its aim is to warn 
the reader to proceed with extreme caution in any such enterprise. No cut- 
and-dried solutions are offered to the problems which are discussed. This 
is true for two reasons: first, because many of the methods for dealing 
with these difficulties are highly technical matters of specialized econometric 
analysis and so are completely outside the scope of this volume. Second, 
and more important, solutions are not listed mechanically because there 
simply are no panaceas; the problems must be dealt with case by case as 
they arise, and the effectiveness with which they can be handled is still 
highly dependent on the skill, experience, and judgment of the specialist 
investigator. 

If after reading the chapter the reader is left somewhat worried and 
uncomfortable, it will have accomplished its purpose. However, it should 
be emphasized that the problems which are raised, serious and difficult 
though they be, are not totally intractable and beyond the power of our 
statistical techniques.


2. Interview Approaches to Demand Determination 


Before turning to statistical methods for the finding of demand func- 
tions, it is appropriate to say a few words about a more direct method for 
dealing with the problem: the consumer interview approach. In its most
blatant and naive form, consumers are simply collared by the interviewer
and asked how much they would be willing to purchase of a given product
at a number of alternative product price levels.

It should be obvious enough that this is a dangerous and unreliable 
procedure. People just have not thought out in advance what they would 
do in these hypothetical situations, and their snap judgments thrown up 
at the request of the interviewer cannot inspire a great deal of confidence.
Even if they attempt to offer honest answers, even if they had thought
about their decisions in advance, consumers might well find that when
confronted with the harsh realities of the concrete situation, they behave
in a manner which belies their own expectations. When we get to the effects
of advertising on demand, the problems of such a direct interview approach 
become even more apparent. What is the consumer to be asked—how much 
more of the company's product he would buy if it were to institute a 10 
per cent increase in its spot announcements to its television budget? 

Much more subtle and effective approaches to consumer interviewing
are indeed possible. Indirect, but far more revealing, questions can be 
asked. Consumers may, for example, be asked about the difference in price 
between two competing products, and if it turns out that they simply do 
not know the facts of the matter, one may be led to infer that a lower 
product price may have a relatively limited influence on consumer behavior, 
just because few consumers are likely to be aware of its existence. A clever
interview-designer may in this way build up a strategy of indirect questions 
which gradually isolates the required facts. 

Alternatively, consumers may be placed in simulated market situations, 
so-called consumer clinics, in which changes in their behavior can be ob- 
served as the circumstances of the experiment are varied. An obvious ap- 
proach to this matter is to get groups of housewives together, give them 
small amounts of money with which they are offered the opportunity to
purchase one of, say, several brands of dishwasher soap which are put on
display at the clinic, and observe what happens as the posted prices on the
displays are varied from group to group. Here again, much more subtle
displays are varied from group to group. Here again, much more subtle 
variants in experimental design are clearly possible. 

But even the best of these procedures has its limitations for our purpose, 
which is the determination of the precise form of a demand relationship. 
Artificial consumer clinic experiments inevitably introduce some degree of 
distortion because subjects cannot be kept from realizing that they are in 
an experimental situation. In any event, such clinics are rather expensive 
and so the samples involved are usually extremely small—too small for 
confidence in any inferences which are drawn about the magnitudes of the 
parameters of the demand relationships for the body of consumers as a 
whole. And large sample interviews which approach the determination of 
consumer demand patterns by subtle and indirect questions are often
highly revealing, but they rarely can supply the quantitative information
required for the estimation of a demand equation.


3. Direct Market Experiments 


A second alternative approach which is sometimes considered as a 
means for finding demand relationship information is the direct market
experiment. A company engages in a deliberate program of price or ad- 
vertising level variation. Suppose it increases its newspaper advertising 
outlay in one city by 5 per cent, in another city it increases this outlay by 
10 per cent, and in still a third metropolis a 10 per cent reduction is under- 
taken. In some ways such a direct experimental approach must always be 
the most revealing. It gives real answers to our formerly hypothetical ques- 
tions and does so without subjecting the consumer to the artificial atmos- 
phere of the interview situation or the consumer clinic. 

However, direct experimentation has its serious limitations as well. 

1. It can be very expensive or extremely risky for the firm. Customers 
lost by an experimental price increase may never be regained from com- 
petitive products which they might otherwise never have tried, and a 10 
per cent increase in advertising outlay for any protracted period may be 
no trivial matter. 

2. Market experiments are almost never controlled experiments, so 
that the observations which they yield are likely to be colored by all sorts 
of fortuitous occurrences—coincidental changes in consumer incomes or 
in competitive advertising programs, peculiarities of the weather during 
the period of the experiment, etc. 

3. Because of the high cost of the experiments and because it is often 
simply physically impossible to try out a large number of variations, the 
number of observations is likely to be unsatisfactorily small. If, for example, 
it is desired to determine the effects of varied advertising outlay in a na- 
tional periodical, the company cannot increase the size of its ads which are 
seen by Nashville readers and simultaneously reduce those which are seen 
in Lexington, Kentucky. This difficulty has been eased to some extent by 
the fact that a number of national magazines now put out several regional 
editions, but by and large the problem remains: Market experiments usually 
supply information only about a very limited number of alternatives. 

4. For similar reasons, market experiments are often of only relatively 
brief duration. Companies cannot afford to permit them to run long enough 
to display much more than impact effects. And yet the distinction between 
impact effects and long-run effects of a change is often extremely significant, 
as was so clearly demonstrated by the sharp but very temporary drop in 
cigarette sales when the first announcement was made about the association
between smoking and the incidence of cancer. How often has a rise in the
price of a product caused a major reduction in purchases for a few weeks,
with customers then gradually but steadily drifting back?

Market experiments do have a role to play in demand relationship 
determination. They can be important as a check on the results of a sta- 
tistical study. Or they can provide some critical information about a few 
points on the demand curve in which past experience is entirely lacking. 
In some special circumstances experimentation is particularly convenient 
and has been used in the past, apparently with a considerable degree of 
success. For example, some mail-order houses have employed systematic 
programs in which a few special experimental pages were bound incon- 
spicuously into the catalogues distributed to customers within restricted 
geographic regions, thus permitting observation of the effects of price, 
product, or even catalogue display variations. However, it should also be 
clear that market experiments cannot by themselves be relied upon uni- 
versally to provide the demand information needed by management. 
Economics is just not a subject which lends itself readily to experimenta- 
tion, largely because there are always too many elements beyond the con- 
trol of the investigator and because economic experimentation is often 
inherently too expensive, risky, and difficult. 


4. Standard Statistical Approaches 


The third, and generally most attractive, approach to demand function 
determination attempts to squeeze its information out of sources such as 
the accumulated records of the past (a time-series analysis), or a com- 
parative evaluation of the performance of different sectors of the market 
(a cross-sectional analysis). The available statistics on sales, prices, ad- 
vertising outlays of the most relevant varieties, and other marketing data 
are gathered together and then analyzed with the aid of the standard 
statistical techniques.

The basic procedure is simple enough; in fact, as we shall see presently,
it is often far too simple, particularly in the case of advertising-sales relationships
where near-perfect correlations, which are in fact spurious, are very
common. Suppose, for example, that the
following data on company sales and advertising outlays have been 
accumulated: 


TABLE 1

Year                                1950  1951  1952  1953  1954  1955  1956  1957
Sales (millions of dollars)           67    73    54    62    70    75    79    83
Advertising (millions of dollars)     12    15    13    14    18    17    19    15

Once the figures have been plotted, the pattern formed by the dots can
be used in an obvious manner to fit a straight line (see Figure 1) or a
curve to them. This line is then taken as the desired advertising-demand
curve. Its slope can be used as a measure of advertising effectiveness, that
is, it measures the marginal sales productivity of an advertising dollar,
Δ sales/Δ advertising outlay. This line can be determined impressionistically
simply by drawing in a line that appears to fit the dots fairly well,
or any one of a variety of more systematic methods can be used.

The most widely employed and best known of these techniques is the 
method of least squares,¹ in which the object is to find that line which 
makes the sum of the (squared) vertical deviations between our dots and 
the fitted line as small as possible, where the deviations are defined as the 
vertical distances such as AB or CD in Figure 1. The idea is inherently 
attractive. We wish to minimize deviations because a line which involves 
very substantial deviations from the dots representing our data surely 
does not represent the information in a very satisfactory way. But if, in 
our addition process, a large negative deviation such as AB (that is, a 
case where the line underestimates the vertical coordinate of our dot) 
happens to be largely cancelled out by a positive deviation, CD, the sum 
of the deviations can turn out to be small. This is surely not what we want 
in looking for a line which does not deviate much from the dots. One can 
avoid ending up with a line which fits the facts rather badly but in which 
the positive and negative deviations add up to a rather small number, by 
squaring all the deviation figures before adding them together. Since the 
square of a negative real number as well as that of a positive real number is 
always positive, large, squared negative deviations cannot offset large, 
squared positive deviations, and the sum of squared deviations will never 
add up to a small number unless our line happens to fit the dots closely.² 


1 The next few paragraphs are a very elementary review of the method of least 
squares and they should be omitted by the reader who has any acquaintance with the 
subject. 

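
The point about cancelling deviations can be checked numerically. The 
following is only a minimal sketch in Python, using the Table 1 figures; 
the function names and the deliberately poor comparison line (slope -5, 
drawn through the point of means) are our own illustrative assumptions and 
not part of the original exposition. 

```python
# Table 1 figures, in millions of dollars.
advertising = [12, 15, 13, 14, 18, 17, 19, 15]  # x_t
sales = [67, 73, 54, 62, 70, 75, 79, 83]        # y_t


def fit_least_squares(x, y):
    """Return (a, b) for the line y = a + b*x that minimizes the sum of
    squared vertical deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


def deviations(x, y, a, b):
    """Vertical deviations y_t - (a + b*x_t) of the dots from a given line."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


a_ls, b_ls = fit_least_squares(advertising, sales)
dev_ls = deviations(advertising, sales, a_ls, b_ls)

# A deliberately poor line (hypothetical): it passes through the point of
# means but slopes the wrong way, so its deviations still cancel out.
b_bad = -5.0
a_bad = sum(sales) / len(sales) - b_bad * sum(advertising) / len(advertising)
dev_bad = deviations(advertising, sales, a_bad, b_bad)

sum_ls, sse_ls = sum(dev_ls), sum(d * d for d in dev_ls)
sum_bad, sse_bad = sum(dev_bad), sum(d * d for d in dev_bad)

print(f"least squares line: sales = {a_ls:.2f} + {b_ls:.2f} * advertising")
print(f"sum of deviations:  fitted {sum_ls:.6f}, poor line {sum_bad:.6f}")
print(f"sum of squares:     fitted {sse_ls:.1f}, poor line {sse_bad:.1f}")
```

Both lines have deviations which sum to (practically) zero, so the plain sum 
cannot tell them apart; only the sum of squares exposes the poor fit. 
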
There exist still more sophisticated techniques for fitting our 
advertising-demand curve from the data. Although it is often too complex and 
expensive to employ in practice, professional statisticians usually consider 
the method of maximum likelihood as their ideal. This method requires 
some information about the probability distribution of the random elements 
which influence sales. From this probability distribution the statistician 
determines a likelihood function 


$$L = f(x_t, y_t, a, b),$$ 


where $x_t$ and $y_t$ represent, respectively, advertising expenditures and 
sales in year $t$, and $a$ and $b$ are the constants in our advertising-demand 
equation $y_t = a + bx_t$. This likelihood function is defined as an answer to 
the following type of question: “Given any specific values of the parameters 
in our equation, say a = 5 and b = 63, how likely is it that the demand 
situation would have generated the statistics $x_{1950} = 12$, 
$y_{1950} = 67$, etc.?” (Note that these values of sales and advertising are 
in fact our observed statistical figures taken from Table 1.) Considering all 
possible values of a and b, we can then employ the differential calculus to 
find the a and b combination which maximizes the value of the likelihood, L. 
We will then have found the a and b which provide, in this sense, the best 
possible explanation of the observed facts, i.e., we will have found that 
equation $y_t = a + bx_t$ whose parameters a and b are most likely to be the 
correct values of the true but unknown parameters, given the facts which were 
actually observed by the data-collector. 


2 Other devices (such as the absolute value or the fourth power of the deviation) might 
accomplish the objective discussed. The reason one chooses to minimize the sum of the 
squares is that under very simple assumptions such estimates have several extremely 
desirable technical properties, among them, that these estimated parameter values are 
“best” in the sense that they minimize variance of the estimate and are unbiased in the 
sense discussed in the appendix to this chapter. 

To find the straight-line equation which satisfies our least squares requirement we 
employ the symbol $y_t$ to represent sales in year $t$ and $x_t$ to represent advertising 
outlay in that year and let the equation of the line to be fitted be written 
$y_{ct} = a + bx_t$, where the subscript c in $y_{ct}$ is there to remind us that in our 
equation the y is a figure calculated from the formula rather than observation. Now we 
proceed as follows: 

Step 1: Define a deviation from our line as 

$$y_t - y_{ct} = y_t - (a + bx_t) = y_t - a - bx_t.$$ 

Step 2: Define a squared deviation as 

$$(y_t - y_{ct})^2 = y_t^2 + a^2 + b^2x_t^2 - 2ay_t - 2bx_ty_t + 2abx_t.$$ 

Step 3: Add the squared deviations 

$$\sum (y_t - y_{ct})^2 = \sum y_t^2 + na^2 + b^2\sum x_t^2 - 2a\sum y_t - 2b\sum x_ty_t + 2ab\sum x_t,$$ 

where, since a is a constant, $\sum_{t=1}^{n} a^2 = a^2 + a^2 + \cdots$ (n equal terms) $= na^2$. 

Step 4: Find the values of a and b (the parameters of our equation) which minimize 
the sum of the squared deviations. We do this with the aid of the usual calculus procedure, 
by taking partial derivatives with respect to a and b and setting them equal to zero, thus: 

$$\frac{\partial \sum (y_t - y_{ct})^2}{\partial a} = 2na - 2\sum y_t + 2b\sum x_t = 0$$ 

and 

$$\frac{\partial \sum (y_t - y_{ct})^2}{\partial b} = 2b\sum x_t^2 - 2\sum x_ty_t + 2a\sum x_t = 0.$$ 

These last two equations contain, in addition to a and b, only known statistical figures 
$x_t$ and $y_t$. The equations can therefore be solved simultaneously to obtain the desired 
parameter values, a and b, i.e., they determine for us the least squares line 
$y_{ct} = a + bx_t$. These two equations are usually referred to as the normal equations 
of the least squares method in this most elementary (two-variable straight-line) case. 
The procedure employed in fitting many variable equations or curvilinear equations is a 
simple and obvious extension of that which has just been described. 

It is of interest to note that in some special cases the least squares 
method turns out to be identical with maximum likelihood. That is, in 
these fortunate circumstances the least squares calculation becomes 
equivalent to the maximum likelihood procedure. We shall presently discuss 
one of the things which may go wrong if the least squares method is 
employed in situations where it does not yield the same results as the 
maximum likelihood calculation. 
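
One well-known special case of this kind, which the text does not spell 
out and which we add here as an assumption, is that in which the random 
elements are independent and normally distributed with zero mean and a 
constant standard deviation. The log-likelihood is then a constant minus 
the sum of squared deviations divided by twice the variance, so whatever 
maximizes one minimizes the other. The sketch below, in Python, checks 
this on the Table 1 figures over a small grid of candidate parameter 
values; the grid and the value chosen for sigma are purely illustrative. 

```python
import math

# Table 1 figures, in millions of dollars.
advertising = [12, 15, 13, 14, 18, 17, 19, 15]  # x_t
sales = [67, 73, 54, 62, 70, 75, 79, 83]        # y_t
n = len(sales)


def sum_of_squares(a, b):
    """Sum of squared vertical deviations from the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(advertising, sales))


def log_likelihood(a, b, sigma):
    """Log-likelihood of the data if y_t = a + b*x_t + e_t, with independent
    normal errors e_t of mean zero and standard deviation sigma (assumed)."""
    return (-n * math.log(sigma * math.sqrt(2.0 * math.pi))
            - sum_of_squares(a, b) / (2.0 * sigma ** 2))


# Least squares estimates from the usual closed-form solution.
mean_x = sum(advertising) / n
mean_y = sum(sales) / n
b_ls = (sum((x - mean_x) * (y - mean_y) for x, y in zip(advertising, sales))
        / sum((x - mean_x) ** 2 for x in advertising))
a_ls = mean_y - b_ls * mean_x

sigma = 6.0  # hypothetical; any positive value ranks the candidates alike

# Candidate (a, b) pairs on a coarse grid centered on the least squares line.
candidates = [(a_ls + da, b_ls + db)
              for da in (-2.0, -1.0, 0.0, 1.0, 2.0)
              for db in (-0.5, -0.25, 0.0, 0.25, 0.5)]

best_by_likelihood = max(candidates,
                         key=lambda p: log_likelihood(p[0], p[1], sigma))
best_by_squares = min(candidates,
                      key=lambda p: sum_of_squares(p[0], p[1]))

print(best_by_likelihood == best_by_squares == (a_ls, b_ls))
```

Under other error distributions the two criteria need not rank the 
candidates in the same way. 
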

Having described now in highly general and impressionistic terms the 
methods which are most commonly employed by the statistician to 
determine relationships, let us now see some of the problems to which they 
give rise. 


5. Omission of Important Variables 


Clearly, sales are affected by other variables in addition to the 
company's advertising expenditure. Prices, competitive advertising, consumer 
income variations, and other variables also play an important role in any 
demand relationship. If, therefore, we try to extract from our statistics a 
simple equation relating sales to advertising outlay alone, and in the process 
we ignore all other variables, our results are likely to be very badly 
distorted. We may ascribe to the company’s advertising outlays sales trends 
which are really the result of other economic changes. The 
behavior of other variables can thus conceal and even offset the effects of 
advertising. To show how serious the results can be, consider the illustrative 
demand equation 


(1) S = -50 + 4A + 0.02Y, 


where S represents sales, A advertising expenditure, and Y consumer 
income. The values given in Table 2 can easily be seen to satisfy the 
equation precisely, and any standard estimation procedure based on such 
information can be expected to yield the correct equation. 


TABLE 2 
Date      1956     1957     1958 
Y        3,000    4,000    3,500 
A           27       28     27.5 
S          118      142      130 


But the standard calculation shows that a two-variable, straight, least 
squares line which gives us a (perfect!) correlation between S and A alone 
(ignoring Y) and which is based on these same values will yield the equation 


(2) S = -530 + 24A. 


This equation asserts that each added dollar of advertising expenditure 
brings in $24 in sales, instead of the true $4 return shown by Equation (1). 
In addition, because of the perfect correlation there is, in this case, no 
residual unexplained variation in S which is left to be accounted for by a 
subsequent correlation between S and Y, i.e., this incorrect procedure 
appears to show that consumer income has absolutely no influence on demand! 
The advertising coefficient has been inflated because it has absorbed the 
influence of Y on sales. 

Incidentally, if, instead of proceeding as we did, we had started 
off by finding a least squares equation relating sales to consumer income 
alone, we would have obtained from the same statistics 


S = 0.024Y + 46, 


which this time overvalues the influence of income on sales and ascribes 
absolutely no effectiveness to advertising.³ 

It is clear, then, that more than two variables must usually be taken 
into account in the statistical estimation of a demand relationship. And, 
in fact, this is ordinarily done, the estimation usually employing what is 
called a least squares multiple regression technique. However, it should be 
remembered that even if we include five variables in our analysis but omit a 
sixth rather important variable, precisely the same difficulties will be 
encountered. That is, the omission of any important variable, however 
defined, from the statistical procedure can lead to serious distortions in its 
results. 


3 The correlation between Y and A creates another difficulty in this example. The 
resulting problems are discussed in the next section. 
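
The arithmetic of this example is easily verified. The Python sketch 
below is only an illustration: it takes the advertising figures to be 
A = 27, 28, and 27.5, which are the values implied by Equation (1) for the 
income and sales figures of Table 2, and the helper `fit_line` is our own 
name for the usual two-variable least squares formula. 

```python
# Figures of the Table 2 example.
income = [3000, 4000, 3500]       # Y
advertising = [27, 28, 27.5]      # A, as implied by Equation (1)
sales = [118, 142, 130]           # S


def fit_line(x, y):
    """Two-variable least squares: return (intercept, slope) of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - slope * mean_x, slope


# The data satisfy the "true" relationship S = -50 + 4A + 0.02Y exactly.
true_fit = [-50 + 4 * a + 0.02 * y for a, y in zip(advertising, income)]

# Leaving out income credits advertising with income's effect ...
intercept_a, slope_a = fit_line(advertising, sales)
# ... and leaving out advertising credits income with advertising's effect.
intercept_y, slope_y = fit_line(income, sales)

print(f"S on A alone: S = {intercept_a:.0f} + {slope_a:.0f}A")
print(f"S on Y alone: S = {intercept_y:.0f} + {slope_y:.3f}Y")
```

Each one-variable line fits the three observations perfectly, yet neither 
slope equals the corresponding coefficient in Equation (1). 
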

This might appear to constitute an argument for the inclusion in the 
analysis of every variable which comes to the statistician’s mind as a 
factor of possible importance, just as a matter of insurance. Unfortunately, 
however, we are not at liberty to go on adding variables willy-nilly. The 
more variables whose influence we want to take into account, the more 
data we require as a basis for the estimation. If we only have statistical 
information pertaining to three points in time, it is ridiculous to try to dis- 
entangle the influence of fifteen variables. In fact, the statistician requires 
many pieces of information for every variable he includes in his analysis, 
if he is to estimate his relationship with a clear conscience. 

However, large masses of marketing data are not easily come by. 
Records are often woefully incomplete; additional data can sometimes be 
acquired only at considerable expense, and in any event, statistics which 
go too far back in time are apt to be obsolete and irrelevant for the com- 
pany's current circumstances. We must, therefore, very frequently be 
contented with skimpy figures which force us to be extremely niggardly 
in the number of variables which we take into account, despite the very 
great dangers involved. 


6. Inclusion of Mutually Correlated Variables 


Another difficulty, which to some extent can help to make life easier 
as far as the problem of the preceding section is concerned, arises when a 
number of the relevant variables are themselves closely interrelated. For 
example, one encounters advertising effectiveness studies in which income 
and years of education per inhabitant are both included as variables. Now 
education is itself very closely related to income level both because higher- 
income families can afford to provide more education and larger inheritances 
to their children and because a more educated person is often in a position 
to earn a higher income. 

It may nevertheless be true that education and income do have different 
consequences for advertising effectiveness. For example, an increase in 
income without any change in educational level could increase the person’s 
willingness to purchase more in response to an ad, whereas more education 
not backed up by larger purchasing power might have the reverse effect. 
But, in general, there is no statistical method whereby these two 
consequences can be separated, because, for the bulk of the population, 
whenever one of these variables increases in value, so does the other. Hence, 
the statistics which can merely exhibit directions of variation might show 
that, other things remaining equal, whenever sales increased, income also 
increased, and so (as a consequence?) did education. 

In such circumstances if we include both the income and the educational 
level variables in the statistical demand-fitting procedure, the chances are 
that the mechanics of the procedure will provide a perfectly arbitrary 
ascription of the sales changes to our two causal variables. And sometimes 
the results may turn out completely nonsensical because the standard com- 
putational procedure has no way to apply common sense in imputing the 
total sales change to the separate influences of education and income 
changes. 

Therefore, if in a demand relationship there occur several variables 
which are themselves highly correlated, it is usually wise to omit all but 
one of any such set of variables in a statistical study. If this is not done, 
another powerful source of nonsense results is introduced. 
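
The arbitrariness of the ascription can be made concrete. No education 
figures are given in the text, so the Python sketch below uses the 
advertising and income figures of the Table 2 example as stand-ins 
(with A = 27, 28, 27.5, the values implied by Equation (1)); in that 
sample the two explanatory variables move exactly together, since 
A = 24 + Y/1000. The function names are our own and the whole block is an 
illustration under these assumptions. 

```python
# Figures of the Table 2 example, used here as a stand-in for any pair of
# explanatory variables that move in lockstep.
income = [3000, 4000, 3500]       # Y
advertising = [27, 28, 27.5]      # A, as implied by Equation (1)
sales = [118, 142, 130]           # S


def fitted(const, coef_a, coef_y):
    """Sales computed from S = const + coef_a*A + coef_y*Y."""
    return [const + coef_a * a + coef_y * y
            for a, y in zip(advertising, income)]


def coefficients(k):
    """Add k*(A - 24 - Y/1000), which is zero at every observation, to
    Equation (1): the coefficients change but the fit does not."""
    return (-50 - 24 * k, 4 + k, 0.02 - k / 1000)


for k in (0, 20, -4, 100):
    const, coef_a, coef_y = coefficients(k)
    worst = max(abs(s - f)
                for s, f in zip(sales, fitted(const, coef_a, coef_y)))
    print(f"k={k:>4}: S = {const} + {coef_a}A + {coef_y:.3f}Y, "
          f"largest error {worst:.1e}")

# The normal equations have no unique solution: the determinant of the
# centered cross-product matrix of A and Y is zero.
n = len(sales)
mean_a = sum(advertising) / n
mean_y = sum(income) / n
s_aa = sum((a - mean_a) ** 2 for a in advertising)
s_yy = sum((y - mean_y) ** 2 for y in income)
s_ay = sum((a - mean_a) * (y - mean_y) for a, y in zip(advertising, income))
determinant = s_aa * s_yy - s_ay ** 2
print("determinant:", determinant)
```

Every value of k gives an equally perfect fit, so the data alone cannot 
say how the sales changes should be divided between the two variables. 
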


7. Simultaneous Relationship Problems 


The difficulties which have so far been discussed, while they can be 
extremely important and are often overlooked in practice (with rather sad 
consequences) may, by and large, be considered rather routine and, in 
retrospect, fairly obvious matters. 

We come now to a far more subtle and perhaps a far more serious 
problem which was only brought to our attention in 1927 by E. J. Working 
and which has only received serious and systematic attention quite re- 
cently, largely as a result of the work of the Cowles Foundation. The 
problem in question, in a sense, follows from the difficulty which was 
discussed in the previous section. If there is a close correlation between two 
variables, it is likely to mean that they are not independent of one another 
and that there is at least one other relevant equation in the system which 
expresses the relationship between them. For example, in our illustrative 
case there might be an equation indicating how income level is ordinarily 
increased by a person’s education. We then end up having to deal with 
not just a single demand equation, but with a system of several equations 
in which a number of the variables interact mutually and are determined 
simultaneously. 

Economics is characterized by such simultaneous relationships. The 
standard example is the price determination process in which a supply 
equation is involved as well as our demand relationship. Similarly, simul- 
taneous relationships constitute the core of national income analysis. 
National income depends on the demand for consumer's goods which helps 
determine the level of profitable production. But the consumption demand 
equation, in turn, involves national income (as a measure of the public’s 
purchasing power) as a variable. To mention another simultaneous 
relationship example, the coal mining industry is a customer for steel whose 
volume of demand depends on coal sales, but the demand for coal itself 
depends heavily on the amount of coal to be used in producing steel. It is 
possible to expand the list of simultaneous relationships in economics 
indefinitely. 

The empirical data which are generated by such a set of equations are 
the information source on which the statistician must base his estimates 
of the relationships. But since these data are the result of a number of such 
relationships, the difficult problem arises of separating out the relationships 
from the observed statistics. 

Unless steps are taken to make sure that the influences of the several 
simultaneous relationships on the data can be and have been separated, 
there is not the slightest justification for the use of any estimation pro- 
cedure, such as that depicted in Figure 1, to compute a statistical relation- 
ship. Yet it will readily be recognized how frequently this completely 
fallacious procedure is employed in practice in the form of simple or multiple 
correlations computed without any attempt to cope with the simultaneous 
relationship problem. Let us see now how serious are the distortions which 
can be expected to result. 


8. The Identification Problem 


In rather general terms our basic problem can conveniently be divided 
into two parts: 


1. In some circumstances the simultaneous relationships (equations) 
will be so similar in character that it will be impossible to unscramble 
them (or at least some of them) from the statistics. Such relationships are 
said to be unidentifiable. Presently it will be shown how such an unhappy 
situation can arise, and it will be indicated that it is unfortunately not 
unheard of in marketing problems. Clearly, in such a case, we are wasting 
our time in a statistical investigation of the equation in question. There 
do exist some mathematical tests which show whether or not an equation 
is identified (i.e., whether or not it is in principle possible to separate it 
from the other relationships in the system). These tests should always be 
applied before embarking on the type of statistical investigation under 
discussion. It must be emphasized that, if an equation happens not to be 
identified, it is impossible even to approximate the true equation from 
statistical data alone. Market experiments or other substitute approaches 
must be employed to obtain this information. 

2. Even if an equation turns out to be identified, precautions must be 
taken to ensure that a statistically estimated equation is not distorted by 
the presence of the simultaneous relationships. We will see in the next 
section that an ordinary least squares procedure is likely to lead to just 
this sort of distortion. 


In this section we deal with the first of these, the identification problem: 
the circumstances under which it is, at least in principle, possible to 
unscramble our simultaneous relationships statistically. 

To illustrate, let us consider what is involved in finding statistically an 
advertising-demand curve such as the one which Figure 1 attempted to 
construct in a rather primitive fashion. Now while sales are doubtless 
affected by advertising, as the advertising-demand function assumes, 
this function is often accompanied by a second relationship in which what 
we might call the direction of causation is reversed. It is well known that a 
firm's advertising budget is frequently affected by its sales volume. In 
fact, many businesses operate on a rule of thumb which allocates to adver- 
tising expenditure a fixed proportion of their total revenues. For such a 
business, then, we will have two advertising expenditure demand relation- 
ships: (1) the demand function which shows how quantity demanded, Q, 
is affected by a firm's advertising budget, A: Q = f(A), and (2) the budget- 
ing equation which shows how the firm's advertising decisions are affected 
by the demand for its product: A = g(Q). 

Both of these relationships may actually be of interest to the business- 
man. The first, as already stated, is directly relevant to his own optimal 
expenditure decision. The second, if obtained from industry records, will 
give him vital information about the behavior patterns of his competitors. 

The firm’s actual sales and its actual advertising expenditure will, of 
course, depend on both its advertising budgeting practices (the budgeting 
equation) and on the demand-advertising relationships. In Figure 2 the 
graphs of two such hypothetical relationships are depicted. 

In Figure 2a we show the two curves which the statistician is seeking. 
We make ourselves, as it were, momentarily omniscient and thus have no 
difficulty envisioning the true relationships. However, the information 
available to the statistician is much more restricted as we shall now see. 
In our situation the actual advertising expenditure, A, and the volume of 
sales, Q, are determined, as for any pair of simultaneous equations, by the 
point of intersection, P, of the two curves. 

We now can describe two cases of nonidentification: 


Case 1. Neither Curve Identified. 


If the two curves were to retain their shape from year to year, that is, 
if neither of them ever shifted, all the intersection points P would coincide or 
at least lie very close together (Figure 2b). There would only be a single 
observed point, as in the figure, or the tightly clustered points would form 
no discernible pattern, and so the shape of neither curve could even 
approximately be found from the data. We see then, though it may be a bit 
surprising, that curves which never shift are from this point of view the 
worst of all possibilities. 

Figure 2 


Case 2. One of the Curves Not Identified (but the other curve identifiable). 


This is a case frequently encountered in practice when the demand curve 
of one firm is investigated. The data form a neat and simple pattern, but 
what they describe is the firm’s inflexible advertising budgeting practices 
rather than the nature of the demand for its product. In such circum- 
stances what happens is that the budget curve never shifts but the demand 
curve does. There will then be a number of different intersection points, 
such as P, P’, and P", but they will always describe only the shape of the 
advertising budget line (Figure 3). The reader can well imagine how often 
statistical attempts to find the advertising-demand curve have produced neat 
linear relationships (and spectacularly high correlation coefficients), though 
what the triumphant investigator has located (without his knowing it) is a 
totally different curve from the one he was seeking. The situation which we 
have just examined is really ideal from the point of view of the statistician, 
provided the relationship which is not shifting happens to be the one he is 
seeking. But the question remains: How is he to know when one relationship 
is standing still, and even if he somehow knows this, how does he determine 
which one it is? We will see that in the answers to these questions lies the 
key to the solution of the identification problem. 
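
Case 2 can be reproduced with a small numerical experiment. In the Python 
sketch below every number is hypothetical: we assume a linear demand 
curve Q = d + 2A whose intercept d shifts from year to year, and a firm 
which always budgets 10 percent of its sales for advertising, A = 0.1Q. 
The helper names are likewise our own. 

```python
DEMAND_SLOPE = 2.0   # assumed true sales response to a unit of advertising
BUDGET_SHARE = 0.1   # assumed rule of thumb: advertising = 10% of sales


def intersection(demand_intercept):
    """Solve Q = demand_intercept + DEMAND_SLOPE*A together with
    A = BUDGET_SHARE*Q for the observed point (A, Q)."""
    a = BUDGET_SHARE * demand_intercept / (1 - BUDGET_SHARE * DEMAND_SLOPE)
    q = demand_intercept + DEMAND_SLOPE * a
    return a, q


def fit_line(x, y):
    """Two-variable least squares: return (intercept, slope) of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    return mean_y - slope * mean_x, slope


# The demand curve shifts from year to year; the budget line never moves.
points = [intersection(d) for d in (40, 50, 60, 70)]
adv = [p[0] for p in points]
qty = [p[1] for p in points]

intercept, slope = fit_line(adv, qty)
print(f"fitted line: Q = {intercept:.2f} + {slope:.2f}A")
print(f"budget line: Q = {1 / BUDGET_SHARE:.2f}A; "
      f"demand slope: {DEMAND_SLOPE:.2f}")
```

The fit is perfect, but the fitted slope is that of the budget line 
(Q = A/0.1), not the assumed demand slope of 2. 
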


It will be shown presently that only where both curves shift over time 
or from firm to firm or from geographical territory to territory can they 
ordinarily both be identified. However, in this case the difficult task of 
unscrambling the two relationships becomes particularly acute. Figure 4 
illustrates how three points, A, B, and C, in a diagram similar to Figure 1 
might have been generated by three different (shifted) pairs of our curves. 
It is noteworthy that the negatively sloping (!) "advertising curve" FF' 
estimated statistically from these points bears not the slightest resemblance 
to any of the true curves. Nor, since it is merely a recording of points of 
intersection, is there any reason why it should. The shape of FF’ is not even 
any sort of “compromise” between those of the budget and advertising-demand 
curves! We conclude that where simultaneous relationships are present the 
standard curve-fitting techniques described in Section 4 and Figure 1 
may well break down completely. Their results are likely to bear absolutely 
no resemblance to the equations which are being sought! Such a naive approach 
may therefore well be worse than no investigation because misleading 
information is usually worse than no information at all. 

Figure 4 
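
A numerical illustration of the same phenomenon may be helpful. All the 
figures in the Python sketch below are hypothetical: we assume a demand 
curve Q = d + 2A and a budget line A = b + 0.1Q, both upward sloping, and 
let both intercepts d and b shift between three observations. 

```python
def intersection(demand_intercept, budget_intercept,
                 demand_slope=2.0, budget_share=0.1):
    """Solve Q = demand_intercept + demand_slope*A together with
    A = budget_intercept + budget_share*Q for the observed point (A, Q)."""
    a = ((budget_intercept + budget_share * demand_intercept)
         / (1 - budget_share * demand_slope))
    q = demand_intercept + demand_slope * a
    return a, q


# Three observations, each produced by a different (shifted) pair of curves.
shifts = [(50, 5), (70, 1), (30, 9)]   # (demand intercept, budget intercept)
points = [intersection(d, b) for d, b in shifts]
(a1, q1), (a2, q2), (a3, q3) = points

# The three points happen to be collinear, so the slope between any two of
# them is the slope of the line a curve-fitter would draw through them.
slope_12 = (q2 - q1) / (a2 - a1)
slope_23 = (q3 - q2) / (a3 - a2)

print("observed (A, Q) points:", points)
print("slopes between successive points:", slope_12, slope_23)
```

In the (A, Q) plane the assumed demand curve has slope 2 and the assumed 
budget line has slope 10, yet the line through the observed points slopes 
downward. 
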
Let us now see how one can, in principle, test whether the relationship 
we are seeking is identified (potentially discoverable by statistical means). 

First we note that, as the model has so far been described, there is no 
way of accounting for any shifts in either relationship, which, as we have 
observed, are crucial for our problem. The reason is that only two variables, 
A and Q, have been considered in the relationships Q = f(A) (the demand 
relationship) and A = g(Q) (the advertising budget equation). 
There must, in fact, be some other influences (other variables) which 
disturb the relationships between Q and A and produce the shifts in their 
graphs. These additional variables must be taken explicitly into account. 

As we know, the demand relationship is likely to involve many variables 
in addition to A. For example, consumers' disposable income is a variable 
which affects the volume of sales resulting from a given level of advertising 
expenditure, though, very likely, it does not enter the firm's budget 
calculation explicitly but only indirectly via the effects of income on the 
sales of the company’s product. Similarly, the firm’s budget policy may be 
affected by its past dividend payments, which determine how much it can 
currently spare for advertising expenditure, but this dividend policy will 
have little or no effect on the demand curve for its products. Suppose, for 
the sake of simplicity, that the four variables Q, A, Y (the disposable 
income), and D (the total dividend payments in the preceding year) are the 
only ones that are relevant to the problem. Our two relationships then become 


(3) the advertising demand function Q = f(A, Y) 
and 
(4) the advertising budget equation A = g(Q, D). 


Here changes in the value of Y are what produce the shifts in the graph of 
the demand equation which have been discussed. Similarly, changes in D 
produce shifts in the advertising budget curve. 

Now that we have examined how shifts in the two curves are produced 
we can return to the question of identification. Let us see, intuitively, how 
the presence of the shift variables in Equations (3) and (4) makes it possible, 
in principle, to separate the relationships from the statistics (i.e., how 
the shift variables identify the equations). It will be shown now that 
Y and D permit the statistician, at least conceptually, to divide up the 
statistical information in such a way that he is left with situations like 
that depicted in Figure 3. Such a situation gives him the information that 
permits him to infer which of the relationships is shifting and which is 
standing still. That is, he can determine when one graph is not moving 
while the other shifts around, so that the resulting dots trace out the graph 
of the equation which is not shifting, the equation he is trying to estimate. 
The reader should first be warned, however, that the procedure which is 
about to be described is not usually a practical estimation (curve-finding) 
procedure and that other, more sophisticated measures are normally 
employed for the purpose. 

In Figure 5 we replot the data of Figure 1. Let us, in addition, determine 
for each point the dividend payment, D, for that particular year. Suppose 
this information is as shown in Table 3 (the corresponding sales and 
advertising figures are in Table 1). 


TABLE 3 


Advertising demand point 1950 1951 1952 1953 1954 1955 1956 1957 
Total dividend D ($ millions) 360 297 295 307 428 381 420 300 


We note that the dividend values for the points representing 1951, 1952, 
1953, and 1957 are fairly close together. Hence, if we are convinced that D 
is the only variable which makes for sizable shifts in the advertising budget 
curve, it is reasonable to assume that all four points lie on (or close to) the 
same curve; that is, among these points there has occurred little or no shift 
in the curve. We may, therefore, use these four points (ignoring the others) 
to locate a budget curve UU’ (for dividend level approximately $300 million) 
as shown. Similarly, we can use points for years 1954 and 1956 alone to find 
the shape of the advertising budget curve VV’ which pertains to dividend 
level approximately $420 million, etc. In other words, the additional 
information on the value of D for each point has permitted us, in principle, to 
ignore all points which contain information irrelevant to a given advertising 
budget curve. 

We see, then, that if variable D is present in the one equation but not in 
the other it permits us, in principle, to discover statistical points over which 
the budget line has shifted but through which the demand curve remains 
unchanged, thus enabling us to trace out the corresponding demand curve. 

In an analogous way we were able to trace out a budget line in Figure 3, 
for there the position of the demand curve changed as Y varied, while the 
budget line remained stationary. But while this enabled us to find the 
budget line in Figure 3, there the demand curve was unidentifiable because 
the budget curve never shifted. There is no shift variable such as D in the 
budget relationship postulated at that point which moves the budget line 
about and yet permits the demand curve to stay still. This gives us the 
following result: One of a pair of simultaneous relationships will be identified 
if it lacks a variable which is present in the other relationship. A change in the 
value of that variable will not affect the position of the curve corresponding 
to the relationship we are seeking, but it will shift the other curve. 
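
The result can be illustrated with a purely hypothetical linear version 
of Equations (3) and (4). In the Python sketch below we assume the demand 
function Q = 50 + 2A + 0.01Y and the budget equation 
A = 1 + 0.1Q + 0.02D; none of these coefficients comes from the text, and 
the helper names are our own. Holding one shift variable fixed while the 
other varies traces out the curve from which the varying variable is 
absent. 

```python
def observe(income, dividends):
    """Observed (A, Q) when the assumed demand function
    Q = 50 + 2A + 0.01*income and the assumed budget equation
    A = 1 + 0.1*Q + 0.02*dividends hold simultaneously."""
    a = (1 + 0.1 * (50 + 0.01 * income) + 0.02 * dividends) / (1 - 0.1 * 2)
    q = 50 + 2 * a + 0.01 * income
    return a, q


def fit_slope(points):
    """Least squares slope of Q on A for a list of (A, Q) points."""
    n = len(points)
    mean_a = sum(a for a, _ in points) / n
    mean_q = sum(q for _, q in points) / n
    return (sum((a - mean_a) * (q - mean_q) for a, q in points)
            / sum((a - mean_a) ** 2 for a, _ in points))


# Y held fixed, D varying: only the budget line shifts, so the points
# trace out the demand curve (slope 2 in the A-Q plane).
same_income = [observe(3000, d) for d in (300, 350, 400)]

# D held fixed, Y varying: only the demand curve shifts, so the points
# trace out the budget line (slope 1/0.1 = 10 in the A-Q plane).
same_dividends = [observe(y, 300) for y in (3000, 3500, 4000)]

# Pooling the observations without regard to Y and D mixes the two curves.
pooled = same_income + same_dividends[1:]

demand_slope = fit_slope(same_income)
budget_slope = fit_slope(same_dividends)
pooled_slope = fit_slope(pooled)
print(demand_slope, budget_slope, pooled_slope)
```

The pooled slope matches neither assumed curve, whereas each group of 
points, selected with the aid of the shift variables, recovers one of 
them. 
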

The relevance of the shift variables Y and D for identification can also 
be seen in another way. Assume that on the basis of a priori judgment we 
have already constructed our model consisting of Equations (3) and (4) in 
which we postulate in advance that the variable Y is present only in the 
first of these equations and the variable D appears only in the second. 
Suppose now that we use any simultaneous equation estimation procedure 
to find some statistical relationships among the variables Q, A, Y, and D. 
The system is identified if it is possible, in principle, to obtain one such 
statistical relationship which is known to be an approximation to Equation 
(3) (the demand function) and another statistical function which 
approximates (4) (the budget function), and if it is possible to find out 
whether any given statistical curve derived in the process represents (3), 
(4), or neither. Suppose, then, we have obtained some such statistical function 
from our data on Q, A, Y, and D. How might we be able to tell whether it 
represents a demand function, a budget function, or a hodgepodge 
combination of the two? There are three possibilities: 

1. Suppose, after our calculations are completed, we discover that the 
statistical relationship turns out to take the form Q = F(A, Y, D) in which 
all four variables are present (all of their coefficients are significantly 
different from zero). In that case we know that the statistics have given us a 
mongrel function resembling neither of the relationships we are seeking, for 
the equations of our model tell us that neither of the true relationships 
contains both variables Y and D. 

2. Suppose now that the statistical relationship turns out to have an 
equation of the form F(A, Y) = Q; i.e., D plays no role in the equation. 
Then we can be fairly sure that no budget function component has sneaked 
into our statistical equation, for, if the budget equation had somehow 
gotten mixed into our calculation, the variable D would have shown up in 
our calculated equation, since it is present in the budget curve (4). The 
presence of the variable D would have shown at once that the budget 
relationship had intruded into our computation. But since, in the case we 
are discussing, we obtain an equation Q = F(A, Y) from which D is absent, 
we conclude that our statistical equation must be an estimate of the demand 
relationship (3) alone. 

3. Similarly, if the form of the statistical equation is F(A, D) = Q, it 
must represent the budget relationship (4) alone. 

Thus the two variables Y and D, each of which appears in one and only 
one of the two a priori relationships in our model, have permitted us to 
identify both equations. For example, the presence of the variable D, which 
occurs only in the budget equation, acts as a warning signal which notifies 
us at once when the budget equation has somehow got itself mixed in 
with our demand information. 


9. Least Squares Bias in Simultaneous Systems 


Even if it transpires that a set of simultaneous relationships is identified 
so that it is appropriate to investigate them statistically, the analyst's 
troubles are not yet over. For the statistical methods which yield satis- 
factory results in determining the nature of a single relationship are apt to 
yield seriously biased results in the presence of simultaneous equations. 

To show one way in which this may come about notice first that any 
economic relationships are constantly subject at least to small shifts as 
the result of minor random occurrences. A sudden change in the weather 
or a newspaper strike affects department store sales, rumors of a price 
rise may lead housewives to stock up on a product, and so on. Conse- 
quently, a demand curve can never be expected to stand still for very 
long. Rather, it is likely to shift back and forth so that its position will 
(at least) vary within a (more or less) narrow band. 

Figure 6a illustrates the band within which our illustrative advertising- 
demand curve usually varies as the result of random disturbances. Suppose 
first that this is a single relationship situation so that the advertising- 
demand curve is the only relevant curve. Observed statistics are then likely 
to fall throughout this band as shown by the points in Figure 6a. The dots 
form a pattern very similar in shape to the demand curve itself. A least 
squares line fitted to these data will then tend to follow the same pattern 
and it will be a rather good representation of the true demand curve which 
the statistician is seeking. 

Now let us contrast this with what happens when there is a second 
relationship present—our advertising budget line. The range of variation 
of these two curves is shown in Figure 6b, where both curves may be ex- 
pected to shift about simultaneously. This means that the intersection 
points of the two curves are likely to move about within the diamond-shaped 
region ABCD. The dots within that region, then, represent the information 
which the statistician observes. 

This time it will be noted that the pattern of dots does not resemble 
either curve closely. Moreover, a least squares line, LL’, fitted to these 
dots will generally pass approximately through a diagonal of the diamond 
moving upward and to the right from corner A to corner C. It should be 
clear to the reader that a diagonal line of this sort should appear to yield 
a good fit to such a diamond-shaped collection of dots (see Figure 6c). 
But from our state of omniscience in Figure 6b we can easily see that this 
least squares line is really a very poor approximation to the advertising 
demand curve. 
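
The bias just described can be reproduced in a small simulation. The sketch below is illustrative only: the linear forms, the coefficient values (a true advertising-demand slope of 0.5 and a budget-line slope of 0.5), and the normally distributed disturbances are all assumptions chosen for the example, not figures from the text. It fits a least squares line first to data generated by the demand curve alone, and then to the intersection points of a shifting demand curve and a shifting budget line.

```python
import random


def ols_slope(xs, ys):
    """Slope of the least squares line of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical "true" relationships (assumed for illustration only):
#   demand: Q = 10 + B * A + u1        budget: A = 2 + D * Q + u2
B, D = 0.5, 0.5


def single_relationship(n, rng):
    """Advertising A is set from outside; only the demand curve shifts."""
    ads = [rng.gauss(10, 1.5) for _ in range(n)]
    sales = [10 + B * a + rng.gauss(0, 1) for a in ads]
    return ols_slope(ads, sales)


def simultaneous_relationships(n, rng):
    """Each observation is the intersection of the two shifted curves."""
    ads, sales = [], []
    for _ in range(n):
        u1, u2 = rng.gauss(0, 1), rng.gauss(0, 1)
        # Solve the demand and budget equations jointly for A, then Q.
        a = (2 + D * (10 + u1) + u2) / (1 - B * D)
        ads.append(a)
        sales.append(10 + B * a + u1)
    return ols_slope(ads, sales)


rng = random.Random(0)
slope_single = single_relationship(20000, rng)
slope_simultaneous = simultaneous_relationships(20000, rng)
print("true demand slope:", B)
print("least squares, demand curve alone:", round(slope_single, 3))
print("least squares, simultaneous system:", round(slope_simultaneous, 3))
```

Under these assumed numbers the first estimate should lie close to the true slope of 0.5, while the second settles near 0.8: whenever a random shock raises sales, the budget line raises advertising as well, and least squares wrongly credits part of the shock to advertising.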

All sorts of alternative methods have been devised for simultaneous 
equation estimation to avoid these difficulties of the standard least squares 
approach. Aside from the full maximum-likelihood method, which is gen- 
erally too expensive and cumbersome to be employed in practice, several 
alternatives have been designed and employed extensively. Noteworthy 
are the limited information method, the instrumental variables method, 
and the multiple-stage least squares method (which employs several 
repeated applications of the least squares technique, designed to correct 
for its deficiencies). All of these are intended to serve as approximations 
to the maximum likelihood method. 

There is no point in trying to describe these methods here. It is enough 
for our purpose that the reader has been made aware of the statistical 
problems caused by the presence of simultaneous relationships and of the 
fact that methods exist for dealing with these difficulties. 


10. Concluding Comments 


We have seen, then, how difficult it is to find actual demand relation- 
ships in practice. These problems are, to a large extent, a consequence of 
the very peculiarity of the demand function concept itself—the fact that 
it represents the answers to a set of purely hypothetical questions and that 
the information is taken to pertain simultaneously to the same moment of 
time. Unfortunately, this odd demand relationship turns out to be indis- 
pensable to sophisticated decision-making within the firm. We simply 
have to learn to live with it, and to face up to the difficulties involved in 
its empirical determination. An essential part of this process is knowledge 
of the pitfalls which await the unwary investigators who set out to beard 
the demand function in its lair. 


APPENDIX: NOTES ON IDENTIFICATION AND SIMULTANEOUS EQUATION 
ESTIMATION 


The statistical problems created by the presence of simultaneous rela- 
tionships are not confined only to demand analysis or even to economics. 
Simultaneous equation problems are nearly universal and occur in research 
arising out of disciplines as diverse as sociology and medicine. Since inter- 
dependence is so common a phenomenon the reader should have no diffi- 
culty in thinking up all sorts of illustrative statistical problems in which 
simultaneous relationships play a critical role. It is therefore worth going 
into the methods for dealing with such problems in somewhat greater detail. 
This appendix attempts to explain these on an entirely nonrigorous in- 
tuitive basis. ? 


1. Some Identification Theorems 


Consider a model consisting of the following pair of simultaneous 
supply-demand equations for some farm product, in which Q represents 
quantity sold, P stands for price, Y is an index of consumer income, R is a 
measure of rainfall, S is the amount of subsidy provided to suppliers, and 
$U_1$ and $U_2$ are two random shock variables: 


(1a) 
$Q = A_{10} + A_{11}P + A_{12}Y + U_1$ (demand equation) 
$Q = A_{20} + A_{21}P + A_{23}R + A_{24}S + U_2$ (supply equation). 


A “shock variable” may be roughly defined as a random variable 
which is used to take account of “shocks” or “disturbances” affecting the 
equations. Those disturbances result from changes in other variables 
which affect the system but have been ignored, for example, because each 
taken by itself is insignificant or because they cannot be observed (e.g., 
psychological variables). A model in which explicit cognizance is taken of 
such disturbances by the insertion of one or more random shock variables 
(usually one in each equation, thus accounting for random “shifts” in its 
graph) is called a shock model. In our model, the first equation tells us that 
the amount of the product demanded will depend on its price and on the 
level of consumer income, while, according to the second equation, the 
supply is determined by the price, rainfall, and the extent of the Federal 
farm subsidy. However, both of these relationships are disturbed by random 
elements (changes in tastes, in the number of plant-eating insects, etc.). 
Here the $A_{ij}$ are the constant parameters whose magnitudes are to be 
estimated statistically. 

An equation or a set of equations can be identified only if we possess 
enough a priori information about it, i.e., if each of the equations is known 
in advance to be sufficiently distinctive from the others. The most important 
type of such a priori restriction consists of information about which vari- 
ables do, in fact, enter the equation in question. This sort of information 
can be written in the form 


(2) $A_{ij} = 0$ for various specified $i$ and $j$. 


The economic interpretation of this sort of information (or assumption) 
is obvious: It is believed that the variable $X_j$ does not play any economic 
role in the ith equation. For example, the dividend payments of a firm may 
not affect the demands for its products. 

More specifically, compare our system of supply-demand equations 
(1a) with the following more general pair: 


(1) 
$Q = A_{10} + A_{11}P + A_{12}Y + A_{13}R + A_{14}S + U_1$ 
$Q = A_{20} + A_{21}P + A_{22}Y + A_{23}R + A_{24}S + U_2$. 

It should be clear that our first equations, (1a), involve the a priori 
restrictions 


(2a) $A_{13} = 0$, $A_{14} = 0$, $A_{22} = 0$. 


The use for identification purposes of other types of a priori restrictions, 
such as the requirement that just one of the equations involve curvature, 
has been investigated by econometricians. But restrictions of type (2) 
have been used most frequently in practical economic problems. These are 
far easier to handle computationally than the corresponding conditions 
for restrictions of other types, and for many other types of a priori 
restriction the theory is still in a rather rudimentary form. The theorems which 
are given below (with no attempt to indicate their proofs) will therefore 
be confined to the necessary and sufficient conditions for identification 
with the aid of assumptions of type (2). 

Proposition 1: A necessary condition for the identification of the ith 
equation in a system which is composed of m linear equations with the aid 
of restrictions of type (2) is that at least m — 1 of the parameters occurring 
in equation i be zero, i.e., that we have for that equation at least m − 1 
conditions 

$A_{ij} = 0$. 

In our system (1a) we have two equations, so that m = 2. To see whether 
the first equation in that system meets the requirements of the proposition 
which has just been stated, we note by comparison with the more general 
system (1) that in the first equation $A_{13} = 0$ and $A_{14} = 0$. Thus we have 
more than the minimum of m − 1 = 1 coefficients required to be zero in 
order to achieve identification. We say, therefore, that our first equation 
is overidentified. The second equation, however, has only one coefficient 
missing, i.e., it has only $A_{22} = 0$, so there is no redundant a priori 
information here, and we say that the second equation is just identified. 
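
The counting rule of Proposition 1 is easy to mechanize. The following is a minimal sketch, assuming each equation is described simply by the set of right-hand variables that appear in it; the function name and the data layout are inventions for this illustration. Since the proposition states only a necessary condition, the labels it prints are presumptions, not proofs of identification.

```python
def order_condition(system, variables):
    """Count the excluded variables in each equation (Proposition 1).

    `system` maps an equation name to the set of variables appearing in it;
    `variables` lists every variable of the general system. Returns, for each
    equation, the number of zero coefficients and a tentative label.
    """
    m = len(system)  # number of equations
    verdicts = {}
    for name, present in system.items():
        zeros = sum(1 for v in variables if v not in present)
        if zeros < m - 1:
            label = "fails Proposition 1"
        elif zeros == m - 1:
            label = "just identified"
        else:
            label = "overidentified"
        verdicts[name] = (zeros, label)
    return verdicts


# System (1a): demand contains P and Y; supply contains P, R, and S.
system_1a = {"demand": {"P", "Y"}, "supply": {"P", "R", "S"}}
for name, (zeros, label) in order_condition(system_1a, ["P", "Y", "R", "S"]).items():
    print(name, zeros, label)
```

For system (1a) the demand equation shows two zero coefficients and the supply equation one, which matches the classification given in the text.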

Proposition 2: A necessary condition for the identification of the kth 
equation in a simultaneous system of linear equations under restrictions 
of type (2) is that for every other equation i there exists one variable $X_j$ 
which appears in equation i but not in equation k, i.e., for which 


$A_{kj} = 0$, $A_{ij} \neq 0$. 


Thus if any two equations in a simultaneous system of linear equations 
contain exactly the same variables, they cannot be identified. But if each 
contains one variable which is absent from the other, they may be identified. 

Proposition 2 becomes important in a system involving more than two 
equations. For example, in a set of three (m = 3) equations, two of them 
may each have two coefficients equal to zero, hence satisfying the 
requirement of Proposition 1, but if both equations lack exactly the same 
coefficients, Proposition 2 tells us that they will not be identified. This is 
illustrated in the following system where the first two equations pass the 
identification test of Proposition 1 but fail the Proposition 2 test: 


$Q = A_{10} + A_{11}X_1 + U_1$ 
$Q = A_{20} + A_{21}X_1 + U_2$ 
$Q = A_{30} + A_{32}X_2 + A_{33}X_3 + U_3$ 
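
Proposition 2 can be checked pairwise in the same spirit. The sketch below is a hedged illustration: it again represents each equation by the set of variables it contains (a representation assumed here, not taken from the text) and reports, for each equation k, every other equation that has no variable absent from k.

```python
def proposition_2_violations(system):
    """For each equation k, list the equations i that contain no variable
    absent from k. A nonempty list means equation k fails Proposition 2."""
    violations = {}
    for k, vars_k in system.items():
        violations[k] = [
            i for i, vars_i in system.items()
            if i != k and vars_i <= vars_k  # every variable of i is also in k
        ]
    return violations


# The three-equation system above: X1 appears in the first two equations,
# X2 and X3 only in the third. (Q appears in all three and is omitted.)
three_equations = {1: {"X1"}, 2: {"X1"}, 3: {"X2", "X3"}}
print(proposition_2_violations(three_equations))
# -> {1: [2], 2: [1], 3: []}
```

The first two equations block one another because they contain exactly the same variables, while the third equation raises no objection under this test.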


Proposition 3:⁴ A necessary and sufficient condition for the identification 
of the ith equation in the general system 

(3) 
$A_{11}X_1 + A_{12}X_2 + \cdots + A_{1n}X_n = U_1$ 
$\cdots$ 
$A_{m1}X_1 + A_{m2}X_2 + \cdots + A_{mn}X_n = U_m$ 

under restrictions of type (2) is the following. Suppose we number our 
variables so that just the first q variables in the system are absent from 
equation i, i.e., so that (2) becomes 


$A_{ij} = 0, \quad j = 1, 2, \ldots, q.$ 

Then the matrix, known as the rank criterion matrix, 


$A^* = \begin{bmatrix} 
A_{11} & A_{12} & \cdots & A_{1q} \\ 
A_{21} & A_{22} & \cdots & A_{2q} \\ 
\vdots & \vdots & & \vdots \\ 
A_{i-1,1} & A_{i-1,2} & \cdots & A_{i-1,q} \\ 
A_{i+1,1} & A_{i+1,2} & \cdots & A_{i+1,q} \\ 
\vdots & \vdots & & \vdots \\ 
A_{m1} & A_{m2} & \cdots & A_{mq} 
\end{bmatrix}$ 


must be exactly of rank m − 1; i.e., it must contain at least one set of 
m − 1 columns and rows which form a nonzero determinant (of order 
m − 1).⁵ Since the matrix has only m − 1 rows (because it excludes the 
coefficients of the equation, i, whose identification is being investigated), it 
obviously cannot be of greater rank than m − 1. It can contain no higher 
order determinant. 


4 This proposition involves considerations which are relatively difficult, and readers 
who are not conversant with elementary properties of matrices and determinants will 
wish to avoid reading the following paragraphs. 

5 For proof see T. C. Koopmans and W. C. Hood, “The Estimation of Simultaneous 
Linear Economic Relationships,” pp. 135-42, in Hood and Koopmans (eds.), Studies in 
Econometric Method, Cowles Commission Monograph No. 10, John Wiley & Sons, Inc., 
New York, 1953. 

With the aid of Proposition 3 a strong presumption about the 
identifiability of equation i in system (3) under restrictions of type (2) can be 
established by inspection. To do this merely take the $A_{ij}$ which are equal 
to zero according to (2), i.e., those variables known a priori to be absent 
in the various equations of the system, and substitute zeros for these $A_{ij}$ 
wherever they appear in the matrix $A^*$. If $A^*$ is then identically of rank 
less than m − 1, then equation i is not identified. For example, in a 
four-equation system (where m − 1 = 3), suppose we have the $A^*$ matrix 


(4) $A^* = \begin{bmatrix} 
0 & 0 & 0 & 0 \\ 
A_{21} & A_{22} & A_{23} & A_{24} \\ 
A_{31} & A_{32} & A_{33} & A_{34} 
\end{bmatrix}$ 


then since this contains no nonzero three-by-three determinant, the 
equation being tested is not identified. On the other hand, if our four-equation 
system had, in testing for identification of one of its equations, yielded the 
$A^*$ matrix 


(5) $A^* = \begin{bmatrix} 
A_{11} & 0 & 0 & 0 \\ 
0 & A_{22} & 0 & 0 \\ 
0 & 0 & A_{33} & A_{34} 
\end{bmatrix}$ 


we can be confident that the equation being tested will be identified, 
because $A^*$ involves at least one nonzero third-order determinant [unless 
subsequent statistical calculation shows that some of the A's in (5) are 
equal to zero]. 

Propositions 1 and 2 can readily be shown to be corollaries of 
Proposition 3. 

Example 1: In the system 


$A_{15}X_5 = U_1$ 
$A_{21}X_1 + A_{22}X_2 + A_{23}X_3 + A_{24}X_4 + A_{25}X_5 = U_2$ 
$A_{31}X_1 + A_{32}X_2 + A_{33}X_3 + A_{34}X_4 = U_3$ 
$A_{45}X_5 = U_4$ 


none of the equations is identified. The second and third equations violate the 
requirements of Proposition 1 because they each contain fewer than m − 1 = 3 
zero coefficients. The reader should verify that the fourth equation yields the $A^*$ 
matrix given by (4) and hence violates the condition given in Proposition 3, as 
does the first equation. Alternatively, the first and fourth together violate Proposition 
2 because neither contains a variable which is absent from the other. 


Example 2: In the system 

$A_{11}X_1 = U_1$ 
$A_{22}X_2 = U_2$ 
$A_{33}X_3 + A_{34}X_4 = U_3$ 
$A_{45}X_5 = U_4$ 


all the equations except the third are overidentified. For example, the reader 
should check that the $A^*$ matrix for the fourth equation is given by (5) and hence 
this equation is presumably identified. It is, in fact, overidentified since this system 
has four equations (m = 4), but four (> 3 = m − 1) of the variables do not 
appear. The third equation is just identified since it has exactly m − 1 = 3 
variables missing, and since its $A^*$ matrix is 


$\begin{bmatrix} 
A_{11} & 0 & 0 \\ 
0 & A_{22} & 0 \\ 
0 & 0 & A_{45} 
\end{bmatrix}$ 


whose determinant is nonzero. 
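
The inspection procedure of Proposition 3 can also be sketched in code. What follows is an illustration under stated assumptions, not a definitive test: every coefficient not known a priori to be zero is replaced by an arbitrary nonzero stand-in number (drawn at random here), so the result is only the "strong presumption" described above. The function names are hypothetical. Each equation is given as the set of indices of the variables it contains.

```python
import random
from fractions import Fraction


def matrix_rank(matrix):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(n_cols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def passes_rank_criterion(system, n_vars, i, seed=0):
    """Build A* for equation i (0-based) and test whether its rank is m - 1."""
    rng = random.Random(seed)
    absent = [j for j in range(1, n_vars + 1) if j not in system[i]]
    a_star = []
    for k, present in enumerate(system):
        if k == i:
            continue  # A* excludes the equation being tested
        # Arbitrary nonzero stand-in for each coefficient not restricted to zero.
        a_star.append([rng.randint(1, 10**6) if j in present else 0 for j in absent])
    return matrix_rank(a_star) == len(system) - 1


example_1 = [{5}, {1, 2, 3, 4, 5}, {1, 2, 3, 4}, {5}]
example_2 = [{1}, {2}, {3, 4}, {5}]
print([passes_rank_criterion(example_1, 5, i) for i in range(4)])
# -> [False, False, False, False]
print([passes_rank_criterion(example_2, 5, i) for i in range(4)])
# -> [True, True, True, True]
```

Random stand-ins are used so that no accidental pattern among the numbers lowers the rank; a rank deficiency then reflects only the pattern of zeros imposed by the a priori restrictions.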


2. Criteria for Evaluating Simultaneous Equation Estimation Methods 


As we have seen, an estimation technique is essentially nothing more 
than a method for deciding on the numerical value of some parameter on 
the basis of observed statistical information. A considerable variety of 
alternative estimation techniques exists, and this is particularly true in 
the case of simultaneous equation problems. In deciding which of the 
competing estimation methods to use in coping with a particular problem, 
some obvious practical considerations are relevant. One must ask for each 
of them whether computer programs are immediately and conveniently 
available, which of them has been tested by previous use, and how sparing 
each technique is in its use of expensive computer time. 

In addition to this information it is also obviously necessary to know 
something about the relative virtues of each of these methods from the 
point of view of statistical theory. Here three basic criteria—bias, con- 
sistency, and efficiency—have been employed. These are technical terms, 
whose meaning is not directly related to the everyday usage of the words, 
as we shall see. 

a. Unbiased estimates. An estimating technique is said to produce 
unbiased results in some particular circumstances if, were we to go through 
the estimation process an indefinite number of times, the average 
(arithmetic mean) of the estimates obtained from the various samples of size m 
would equal the true value of the parameter to be estimated. That is, 
consider an attempt by means of a sampling study to find ${}_tA$, the true value 
of some parameter A, e.g., the average height of 1,000 men. Suppose we 
used a sample with twenty-six statistical observations and a given 
estimating method and obtained an estimate $\hat{A}_1$. In the height estimation 
case, $\hat{A}_1$ might be the average height of some twenty-six men selected at 
random from the thousand. Similarly, assume that a second (independent) 
twenty-six-observation sample were to yield a second estimated value, 
$\hat{A}_2$, etc. Then the estimating technique would be said to yield unbiased 
results in these circumstances if the expected value (the arithmetic mean 
of the $\hat{A}$'s), $E(\hat{A}) = (\hat{A}_1 + \hat{A}_2 + \cdots + \hat{A}_n)/n$, were equal to the true 
value, ${}_tA$. 

It is noteworthy that deductive methods can be employed to prejudge 
whether a method will yield biased results. Indeed, it is sometimes possible 
to predict the magnitude of the bias [the difference between the true value, 
${}_tA$, and the average estimate, $E(\hat{A})$]. 

b. Consistent estimates. An estimation technique is said to yield con- 
sistent results in some particular type of circumstance if, as the size of the 
sample on which the estimate is based increases, the estimate approaches 
the true value of the parameter whose value is to be estimated. Thus, 
suppose this time that $\hat{A}_1$ is the estimate obtained from a 
twenty-one-observation sample, $\hat{A}_2$ is obtained from a twenty-two-observation sample, 
etc. Then the estimating procedure is consistent if the successive estimates 
$\hat{A}_1, \hat{A}_2, \ldots, \hat{A}_m$ come closer and closer to the true value of the parameter, 
${}_tA$, as the sample size, m + 20, approaches infinity. 

It is to be noted that the term “consistency” denotes a property which 
is strictly relevant only in cases involving large quantities of data. Thus it 
is conceivable that some procedures which can be shown to be consistent, 
and therefore perform well in large-sample problems, may behave very 
poorly when utilizing the limited data which so frequently are all the 
economist has available.⁶ 

A final concept used in evaluating an estimation procedure is 

c. Efficiency. An estimating method is said to be efficient if in seeking 
the true value of a parameter, A, the estimates $\hat{A}_1, \hat{A}_2, \ldots$ which the 
method yields on application to different samples differ from one another 
by a smaller amount than the estimates produced by any other method. 


6 Though there is at least a superficial resemblance, the situation is not the same for 
the bias property. If we know that an estimate is unbiased, we are assured that “on the 
average” even small sample estimates tend to be good approximations. 

That is, suppose we take several estimates of A and calculate their variance 
(or their standard deviation). If it can be shown that this estimation pro- 
cedure yields variance figures no larger than any other possible estimation 
technique, it is said to be efficient. Basically, the statement that an estima- 
tion procedure is efficient asserts that its estimates of a parameter are 
dependable and do not vary very much from sample to sample. 
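
The notions of bias and of sampling variance can be made concrete by enumerating every possible sample from a tiny population, so that no simulation error intrudes. The sketch below is illustrative only: the five heights, the sample size of three, and the three estimators are all invented for the example. It also compares just these three estimators with one another, whereas efficiency in the sense defined above requires a comparison with every possible method.

```python
from fractions import Fraction
from itertools import permutations

population = [60, 64, 66, 70, 75]  # hypothetical heights, in inches
true_mean = Fraction(sum(population), len(population))

# Every ordered sample of three distinct men; each is taken as equally likely.
samples = list(permutations(population, 3))

estimators = {
    "sample mean": lambda s: Fraction(sum(s), len(s)),
    "first observation only": lambda s: Fraction(s[0]),
    "largest observation": lambda s: Fraction(max(s)),
}


def expected_value(estimate):
    """Average of the estimates over all possible samples."""
    return sum(estimate(s) for s in samples) / len(samples)


def variance(estimate):
    """Variance of the estimates over all possible samples."""
    mean = expected_value(estimate)
    return sum((estimate(s) - mean) ** 2 for s in samples) / len(samples)


for name, estimate in estimators.items():
    bias = expected_value(estimate) - true_mean
    print(f"{name}: bias = {float(bias):.2f}, variance = {float(variance(estimate)):.2f}")
```

In this made-up population the sample mean and the first observation are both unbiased estimates of the average height, but the sample mean varies far less from sample to sample; the largest observation is biased upward.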

Using these concepts we can now go on to discuss some of the standard 
estimation methods. 


3. Maximum Likelihood Method: General Description 


The problem of estimation may for present purposes be considered 
that of finding a set of parameters $A_{ij}$ for the system of equations (3) 
where the parameters found must satisfy the a priori restrictions (2) and 
where the estimated values of the parameters must in some sense be those 
which “best” fit the facts (the statistics on which the estimates are based). 
Let us now use an illustrative problem to try to get a rough idea of what is 
involved in the maximum likelihood method of estimation, which is the 
method very frequently considered most desirable by statisticians. 

Suppose someone were to select any integer A, then roll two dice, and 
add to the unrevealed number A the sum of the numbers, U, which came 
up on the faces of the dice. He then tells us that the total of the two numbers, 
A + U, is 16. Our job is to try to guess his selected number, A. In effect, 
we have the equation 


(6) $16 = A + U$, 


where U is a random shock variable (the sum of the numbers which came 
up on the faces of the dice). The problem is to estimate the value of A. 
Let us see now how one would go about making this estimate by means 
of the maximum likelihood method. The possible numbers or sums on the 
dice range from U = 2 (“snake eyes”) to U = 12 (“boxcars”). But not 
all of these values have equal probabilities of occurring. The second row 
of Table 4 shows us, for example, that in repeated tossing of our dice⁷ a 
value U = 9 may be expected to turn up twice as frequently as U = 3. 
We observe, then, that the most likely (frequent) value of U is 7. If 
that were, indeed, the value of U, then A would have to be A = 16 − U = 
16 − 7 = 9. In other words, 9 is the value of A which is most likely to 
occur, given that A + U = 16. Hence, our maximum likelihood estimate 
of A is $\hat{A} = 9$. 


7 To see how these numbers are computed, consider, for example, the entry U = 4. 
This may occur in exactly three different ways: The first die may turn up a 1 and the 
second a 3, or the first die may show a 3 and the second a 1, or both dice may come out 2; 
hence the entry 3 in the second row under U = 4. Similarly, a value of U = 2 can only 
occur in one single way, and therefore the first entry in the second row is 1. The rest of 
the second row of the table is constructed similarly. 

To summarize, the maximum likelihood method proceeds as follows. 
To estimate the value of some parameter (our A) it takes some observation, 
X (our reported total of 16 for A + U), or observations and asks what 
value of the unknown parameter makes it most likely that the observed 
statistic would have been generated by the equation or equations in 
question. 

Let us now generalize somewhat, replacing our trivial single-observation 
case by one involving several (k) statistical observations, the case which 
usually occurs in practice. It is now convenient to rewrite Equation (6) as 


(7) $X_t = A + U_t$, 

where $X_t$ is the observed value of X at time t, etc. For example, if a 
successive repetition of our dice-tossing experiment had brought up the 
numbers 18, 12, etc., after the initial total of 16, we would have $X_0 = 16$, 
$X_1 = 18$, $X_2 = 12$, etc. 


TABLE 4 

U                            2   3   4   5   6   7   8   9  10  11  12 
Number of ways U can occur   1   2   3   4   5   6   5   4   3   2   1 
A (= 16 − U)                14  13  12  11  10   9   8   7   6   5   4 


Suppose, moreover, that the $U_t$ has a joint frequency 
function, $F(U_0, U_1, \ldots, U_k)$, known a priori. This function, of course, 
corresponds to the entries in the second row in Table 4, which could then 
have been described as F(U). Substitution into this function from (7) gives 
us a frequency function involving only the observed statistical values, 
X, and the unknown parameter A, $F(X_0 - A, X_1 - A, \ldots, X_k - A)$, 
which we may rewrite simply as $G(X_0, X_1, \ldots, X_k, A)$. If the values of 
$X_0, X_1, \ldots$ are given by observation, the only variable in 
$G(X_0, X_1, \ldots, X_k, A)$ is A. The maximum likelihood estimate of A is then given by the 
value of A for which this function attains its maximum, i.e., for which the 
derivative of G, with respect to A, 


$\dfrac{dG(X_0, X_1, \ldots, X_k, A)}{dA}$, 


is equal to zero. 
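
For the dice illustration the maximization can be carried out by direct search, since A is an integer and the derivative condition has no literal application. The sketch below is a hedged example: it assumes that successive throws are independent, so that the joint frequency function is the product of the single-throw frequencies, and the function names are invented for the illustration.

```python
def ways(u):
    """Number of ways two dice can total u (zero outside the range 2 to 12)."""
    return 6 - abs(u - 7) if 2 <= u <= 12 else 0


def likelihood(a, observations):
    """Joint frequency of the observed totals if the hidden number were a.

    Assumes independent throws, so the joint frequency is a product.
    """
    result = 1
    for x in observations:
        result *= ways(x - a)
    return result


def max_likelihood_estimate(observations):
    """Search every integer A compatible with all of the observed totals."""
    # Each total x requires 2 <= x - A <= 12.
    candidates = range(max(observations) - 12, min(observations) - 1)
    return max(candidates, key=lambda a: likelihood(a, observations))


print(max_likelihood_estimate([16]))          # the single observation of the text -> 9
print(max_likelihood_estimate([16, 18, 12]))  # the three totals used above -> 9
```

If the reported totals lie more than ten apart, no integer A is compatible with all of them and the search range is empty, in which case `max` raises a `ValueError`.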
We may, in an obvious manner, extend the discussion to the case of a 
more general system like (3). There the random variables, $U_i$, may be 
considered functions of the parameters $A_{ij}$ and the values of the variables 
$X_{jt}$ which are to be observed. We must again assume that we know 
something about the likelihood function for the random variables. Let us 
use (3) to express the $U_i$ in terms of the $A_{ij}$ and the observed values of the 
variables $X_j$, and then substitute these expressions for $U_i$ in the likelihood 
function, $\phi(U_1, \ldots, U_m)$. We thereby obtain the rewritten likelihood 
function 


(8) $L = \phi(X_{11}, \ldots, X_{nk}, A_{11}, \ldots, A_{mn})$, 


which is a function of the observed statistics $X_{jt}$ and the unknown 
parameters $A_{ij}$. 

Given the statistics we may then estimate the values of the parameters 
as those values of the $A_{ij}$ for which the likelihood function attains its 
maximum subject to the constraints imposed by the a priori information given 
by (2). This is the essence of the idea behind the maximum likelihood method 
of estimation. 


PROBLEM 


Calculate the maximum likelihood estimate of A in our dice-throwing experi- 
ment if the dice are thrown twice and the reported sums of A and U are 16 on the 
first throw and 14 on the second. (This problem will be fairly difficult for the reader 
who knows no elementary probability theory.) 


4. Advantages and Disadvantages of the Full-Information Maximum Likelihood 
Method 


The full-information maximum likelihood method takes into account 
all of the a priori restrictions (2) for every equation of the system. As has 
just been stated, the equations (2) are treated as constraints. The full- 
information method then proceeds by finding the values of the $A_{ij}$ which 
maximize L in (8) subject to all of the constraints (2). It normally employs 
the classical differential calculus methods for maximization of the value of a 
function subject to equality constraints which were discussed toward the 
end of Chapter 4. 

Aside from its great intuitive attractiveness, the method has several 
advantages from the point of view of statistical theory. In important 
classes of cases these estimates are consistent and efficient. They are some- 
times (but not always) also unbiased. 

However, the method also has two important disadvantages. The first, 
which is a disadvantage to the economic but not necessarily to the 
statistical theorist, is that we must assume that we have some advance 
knowledge about the probability distribution of the random shocks, $U_i$, when in 
fact it seems difficult to visualize economically relevant cases where we 
will even have any grounds on which to base a good guess about the nature 
of this function. In the literature it is customary to assume at least that 
the distribution of such disturbances will be normal, but it is not easy to 
find the basis on which this premise is accepted. 

The second disadvantage of the full-information maximum likelihood 
method lies in the complexity of the calculations which it requires except 
where the probability distribution of the disturbances is normal and the 
equations happen to be just identified, in which case the maximum likeli- 
hood estimates coincide with a form of simple least squares estimate of the 
parameters of the system (3).⁸ Generally, however, the process of 
computation is apt to be exceedingly slow and expensive, and it is very rarely used 
in practice. As computers with higher speeds and larger memory capacities 
have become available, this disadvantage has, however, become somewhat 
less serious. 


5. Structural Equations and Reduced-Form Equations: Definitions 


A structural equation may be defined, roughly, as one which explicitly 
results from the (economic) theory, as opposed to an equation which is 
obtained by mathematical manipulation of structural equations. Thus, 
our demand-and-supply equations (1a) are structural equations. 

The reduced form of equation system (1) may be described as follows: 
Only the first two variables in the system, Q and P (quantity sold and 
price), are jointly determined, i.e., their values are determined simultane- 
ously by the system of two equations. The other variables, Y (national 
income), R (rainfall), and S (subsidy level), are not determined by our 
supply-demand model. They are called exogenous because their values are 
given from outside the equation system. In general, there will be the same 
number of jointly determined variables as there are equations in the system 
because these equations must suffice to determine the values of the 
dependent (jointly determined) variables. 

Let us now solve our Equations (1) to obtain expressions for the values 
of Q and P in terms of the exogenous variables only. Thus we get equations 
of the form 


(9) 
$Q = D_{10} + D_{11}Y + D_{12}R + D_{13}S + U_1'$ 
$P = D_{20} + D_{21}Y + D_{22}R + D_{23}S + U_2'$, 


where the $D_{ij}$ are constants and the $U'$'s are random variables. This is 
called the reduced form of the system of structural equations (1). 


8 See Koopmans and Hood, op. cit., pp. 140-55. Also see below, Section 6. 


More specifically, in the case of Equations (1a) we can obtain our second (P) 
reduced-form equation (9) as follows. Eliminate Q by setting the two 
equations (1a) equal to one another, thus: 


$A_{10} + A_{11}P + A_{12}Y + U_1 = A_{20} + A_{21}P + A_{23}R + A_{24}S + U_2$ 

so that 

$P(A_{11} - A_{21}) = A_{20} - A_{10} - A_{12}Y + A_{23}R + A_{24}S + U_2 - U_1$ 

or 

$P = \dfrac{A_{20} - A_{10}}{A_{11} - A_{21}} - \dfrac{A_{12}}{A_{11} - A_{21}}Y + \dfrac{A_{23}}{A_{11} - A_{21}}R + \dfrac{A_{24}}{A_{11} - A_{21}}S + \dfrac{U_2 - U_1}{A_{11} - A_{21}}$. 



This is clearly one of our reduced-form equations (9). The other reduced- 
form equation (for Q) can be obtained similarly by eliminating the P’s be- 
tween our two equations (1a). 
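
The algebra of the reduced form can be checked numerically. In the sketch below every coefficient value, and the values chosen for income, rainfall, and subsidy, are hypothetical numbers made up for the illustration. The price is computed from the reduced-form expression for P obtained by equating the two equations (1a), and the demand and supply equations are then evaluated at that price to confirm that they give the same quantity.

```python
import math

# Hypothetical structural coefficients for system (1a); illustration only.
coef = {
    "A10": 100.0, "A11": -2.0, "A12": 0.5,             # demand equation
    "A20": 10.0, "A21": 3.0, "A23": 0.2, "A24": 1.5,   # supply equation
}


def reduced_form_price(c, Y, R, S, U1=0.0, U2=0.0):
    """P expressed in terms of the exogenous variables and the shocks only."""
    denominator = c["A11"] - c["A21"]
    numerator = (c["A20"] - c["A10"] - c["A12"] * Y
                 + c["A23"] * R + c["A24"] * S + U2 - U1)
    return numerator / denominator


def demand(c, P, Y, U1=0.0):
    return c["A10"] + c["A11"] * P + c["A12"] * Y + U1


def supply(c, P, R, S, U2=0.0):
    return c["A20"] + c["A21"] * P + c["A23"] * R + c["A24"] * S + U2


Y, R, S = 50.0, 20.0, 4.0  # made-up income index, rainfall, and subsidy
P = reduced_form_price(coef, Y, R, S)
print("price:", P)
print("quantity demanded:", demand(coef, P, Y))
print("quantity supplied:", supply(coef, P, R, S))
```

With these made-up numbers the price comes out at about 21 and both equations give a quantity of about 83, as they must if the reduced-form expression has been derived correctly.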

It will be noted that each equation now contains only one of the inter- 
dependent (jointly determined) variables P and Q. None of the variables 
on the right-hand side of the equations is jointly determined. The equations 
of the reduced form are generally not structural. Thus, the equations of the 
general system (3) would be of reduced form only if (3) happened acci- 
dentally to meet the reduced-form requirements to begin with, i.e., if the 
A parameters which are held equal to zero by the a priori restrictions (2) 
were such that each of the equations in (3) contained only one jointly 
determined variable. 


6. The Reduced-Form Method 


As indicated earlier in this chapter, the use of the ordinary least squares
method to estimate directly the parameters of a structural equation yields
estimates which are biased. Indeed, they will also be inconsistent, so that
in terms of the criteria of statistical theory this procedure has little to
recommend it. However, it can be shown that it is normally legitimate to
use the ordinary least squares method, not in determining parameters $A_{ij}$
for our structural equations, but in estimating the parameters $D_{ij}$ of the
reduced-form equations. Because all but one jointly dependent variable
has been eliminated from each reduced-form equation, no equation
contains two mutually interdependent variables, and, as has been indicated
in the body of this chapter, it is this mutual interdependence which
produces least squares bias. In effect, the reduced form is not a system of m
simultaneous equations but, rather, a set of m independent equations,
all of which happen to hold. Hence, since simultaneous relationships are
essentially not involved in the reduced form, it is legitimate to estimate the
parameters of the equations by means of ordinary least squares procedures.

Having found the D coefficients of the reduced form by least squares,
it is then necessary to retrace our steps and use these D values to estimate
the A parameters of our structural equations (1) or (1a). Specifically, if
the system is just identified, i.e., if there are just exactly enough a priori
restrictions (2) to ensure identification of each equation in (3), and no
more, then we can proceed as follows: (a) estimate the $D_{ij}$ of the reduced
form by old-fashioned least squares; (b) now substitute the resulting
reduced-form equations into our structural equations (3), and find the
implied values of the A parameters. In this just-identified case it turns out
that the estimates of the structural parameters thus obtained by application
of least squares to the reduced form will necessarily be the same as the
full-information maximum likelihood estimates.

Unhappily, we are not always so fortunate. The constraints
[a priori restrictions (2)] in the system are frequently more abundant
than the minimum necessary for identification, and then no simple and
inexpensive shortcut to the full-information maximum likelihood estimates
is known. The difficulty is that only if an equation is just identified is it
possible to solve in a straightforward manner for the A coefficients of our
structural equations in terms of the D coefficients of the reduced-form
equations.

Illustration. Suppose in our reduced-form equations (9) the estimated
values of the parameters are $D_{10} = 5$, $D_{12} = 2$, $D_{13} = 3$, $D_{14} = 1$,
$D_{20} = 2$, $D_{22} = -4$, $D_{23} = 0.5$, and $D_{24} = 4$, so that our reduced-form
estimated equations become

\[
\begin{aligned}
Q &= 5 + 2Y + 3R + S \\
P &= 2 - 4Y + 0.5R + 4S,
\end{aligned}
\tag{10}
\]


where the random elements $U_1^{*}$ and $U_2^{*}$ may for the present be taken to
have been dropped for simplicity. It will be recalled that our second
structural equation in (1a) is just identified because it has one less zero
coefficient ($A_{22} = 0$) than the number of equations in the system.
In general terms this equation together with its a priori restriction (2)
can be written

\[ Q = A_{20} + A_{21}P + A_{22}Y + A_{23}R + A_{24}S \tag{11} \]

\[ A_{22} = 0. \tag{12} \]

We can substitute for Q and P from our reduced-form equations (10) into
(11). This gives us

\[
\begin{aligned}
5 + 2Y + 3R + S &= A_{20} + 2A_{21} - 4A_{21}Y + 0.5A_{21}R + 4A_{21}S + A_{22}Y + A_{23}R + A_{24}S \\
&= (A_{20} + 2A_{21}) + (-4A_{21} + A_{22})Y + (0.5A_{21} + A_{23})R + (4A_{21} + A_{24})S.
\end{aligned}
\tag{13}
\]


This equality is supposed to hold no matter what the statistical values 
of our variables. Hence the constant terms on both sides of Equation (13) 
must be equal. Similarly, the coefficients of the Y terms on both sides of 
the equations must be equal, and so on. Writing these conditions out 
explicitly we end up with the four equations⁹

\[
\begin{aligned}
A_{20} + 2A_{21} &= 5 \\
-4A_{21} + A_{22} &= 2 \\
0.5A_{21} + A_{23} &= 3 \\
4A_{21} + A_{24} &= 1.
\end{aligned}
\tag{14}
\]

These equations have the five unknowns $A_{20}$, $A_{21}$, $A_{22}$, $A_{23}$, and $A_{24}$. It
is only the addition of the a priori restriction (12) which gives us the fifth
equation required to solve for the values of the A parameters. And, in
fact, substituting this zero value of $A_{22}$ into the second equation we readily
obtain

\[
\begin{aligned}
A_{21} &= -0.5 \\
A_{20} &= 6 \\
A_{23} &= 3.25 \\
A_{24} &= 3,
\end{aligned}
\]

as the reduced-form estimates of the parameters of our second structural
equation (1a).
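
The same computation can be written as a single linear system. This is a minimal sketch, assuming only the four coefficient-matching conditions and the one a priori restriction discussed above; the ordering of the unknowns and the variable names are our own choices.

```python
import numpy as np

# Unknowns, in order: A20, A21, A22, A23, A24.
# Rows 1-4 are the coefficient-matching conditions (14);
# row 5 is the a priori restriction (12), A22 = 0.
coefficients = np.array([
    [1.0,  2.0, 0.0, 0.0, 0.0],   # A20 + 2*A21    = 5
    [0.0, -4.0, 1.0, 0.0, 0.0],   # -4*A21 + A22   = 2
    [0.0,  0.5, 0.0, 1.0, 0.0],   # 0.5*A21 + A23  = 3
    [0.0,  4.0, 0.0, 0.0, 1.0],   # 4*A21 + A24    = 1
    [0.0,  0.0, 1.0, 0.0, 0.0],   # A22            = 0
])
constants = np.array([5.0, 2.0, 3.0, 1.0, 0.0])

A20, A21, A22, A23, A24 = np.linalg.solve(coefficients, constants)
print(A20, A21, A22, A23, A24)
```

Five equations in five unknowns have a unique solution here, which is what just-identification amounts to in this example.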


9 More rigorously, we obtain the first of these equations by setting Y = R = S = 0
in (13). Similarly, we obtain the second equation by subtracting the first equation
of (14) from (13) and setting Y = 1, R = S = 0, etc. This procedure is legitimate,
since the equation (13) is supposed to be valid for any values of Y, R, and S.

It will be noted that we could find these A values only because there
was exactly one a priori restriction (12). If the equation had been
overidentified and there had been more than one such restriction, we would
have had more equations to determine the values of the A's than there
were unknowns, and so we would not have been able to use the reduced-
form method to estimate the A's.¹⁰
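
To see what goes wrong in the overidentified case, the following sketch adds one more restriction to the system for the second structural equation. The extra restriction, $A_{24} = 0$, is purely hypothetical and is introduced here only for illustration; it is not part of the model in the text. With six equations in five unknowns, the equations contradict one another, which shows up as a difference in rank between the coefficient matrix and the augmented matrix.

```python
import numpy as np

# Unknowns, in order: A20, A21, A22, A23, A24.
# Rows 1-4 are conditions (14) and row 5 is restriction (12).
# Row 6, A24 = 0, is a HYPOTHETICAL extra restriction added only to
# illustrate overidentification; it does not appear in the text.
coefficients = np.array([
    [1.0,  2.0, 0.0, 0.0, 0.0],
    [0.0, -4.0, 1.0, 0.0, 0.0],
    [0.0,  0.5, 0.0, 1.0, 0.0],
    [0.0,  4.0, 0.0, 0.0, 1.0],
    [0.0,  0.0, 1.0, 0.0, 0.0],
    [0.0,  0.0, 0.0, 0.0, 1.0],
])
constants = np.array([5.0, 2.0, 3.0, 1.0, 0.0, 0.0])

# A linear system is solvable only if appending the constants as an extra
# column does not raise the rank of the coefficient matrix.
rank_plain = np.linalg.matrix_rank(coefficients)
rank_augmented = np.linalg.matrix_rank(np.column_stack([coefficients, constants]))
consistent = bool(rank_plain == rank_augmented)
print(rank_plain, rank_augmented, consistent)
```

Here the restriction $A_{22} = 0$ forces $A_{21} = -0.5$ while the hypothetical restriction $A_{24} = 0$ forces $A_{21} = 0.25$, so no single set of A values satisfies all six equations.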


PROBLEM 


Using the reduced-form equations (10) show that the reduced-form method
does not permit us to estimate the coefficients of the first of the equations (1a).
Why?

7. The Limited-Information Approach 


Another approach to simultaneous equation estimation, the limited- 
information method, is very powerful in that it allows us to estimate the
coefficients of a single structural equation without requiring us to under- 
take any estimation for the remaining structural equations. This fact 
distinguishes it from the full-information maximum likelihood method 
which presents us with estimates for all coefficients in the system. 

The name of the limited-information method refers to the fact that 
we do not employ the a priori restrictions ($A_{ij} = 0$) which pertain to
equations other than the one being estimated. Neglect of those other 
a priori restrictions has the distinct advantage that we need not specify 
the exact form of the other equations in the system. (Often there may be 
no practical purpose served by specifying these other equations precisely.) 
We do, however, need some minimal information about these equations. 
The method does require us to specify all of the exogenous variables that 
enter the equations in which we are not directly interested. 

In sum, the limited-information method reduces the work of model
construction. It also eases considerably the difficulties of computation
and thus reduces the costs of calculation. However, these gains are
unfortunately partially offset by some loss of efficiency: Limited-information
estimates will have greater variability from sample to sample than do
full-information maximum likelihood estimates.


10 More generally, suppose an m equation system contains m jointly determined
variables and k exogenous variables. In trying to find the A parameters of one of the
structural equations from the D parameters of the reduced equations, we will obtain an
equation corresponding to (13) containing k + 1 terms on each side: one constant term
and one term corresponding to each of the k exogenous variables. Thus we will end up
with k + 1 equations like (14) to determine the m + k coefficients, $A_{ij}$, of the m + k
variables, but if, and only if, the equation is exactly identified, it will involve m - 1
a priori restrictions (2), i.e., exactly m - 1 conditions $A_{ij} = 0$. Thus we will have k + 1
equations like (14) and m - 1 equations (2), making m + k equations altogether, the
same as the number of unknown estimates $A_{ij}$. If the structural equation is, however,
overidentified, we will end up with more than m - 1 equations (2) and we will normally
have too many equations to solve for the m + k parameters $A_{ij}$.


8. The Method of Instrumental Variables


Two other methods of simultaneous equation estimation merit special
emphasis. One of these is the method of instrumental variables and the
other is Theil's method of two-stage least squares. It will be seen that both
of these seek to take advantage of the simplicity of the least squares
computations while making some attempt to reduce its deficiencies in a
simultaneous equation problem.

As was indicated earlier, if X and Y are "jointly dependent" variables
in a simultaneous system, the least squares estimates of their coefficients
will be neither unbiased nor consistent. The basic reason for this is that
not only does Y change when X changes but, because of their mutual
interdependence, the change in X may itself, in turn, be influenced by the
change in Y. For example, consider the simultaneous system

\[
\begin{aligned}
Y &= bX + U \\
X &= cY + dZ + V.
\end{aligned}
\tag{15}
\]

If we base our estimate of the value of b in the first equation on the
observed relative movements of Y and X, we are likely to attribute to b
more than just the effect of X on Y.

The method of instrumental variables meets this difficulty as follows.
Instead of estimating b directly in terms of the changes in X and Y, it
tries to "net out" the relationship by using the exogenous variable Z as
an instrument for this purpose. In effect,¹² rather than estimating b by
means of the observed values of $\Delta Y/\Delta X$, it first computes $\Delta Y/\Delta Z$, then
calculates $\Delta X/\Delta Z$, and takes as its estimate for b the ratio of these two,

\[ \frac{\Delta Y/\Delta Z}{\Delta X/\Delta Z}, \]

which, when the $\Delta Z$'s cancel out, should give us $\Delta Y/\Delta X$.
Since Z is not a jointly determined variable, neither Y nor X influences the
value of Z, and so neither $\Delta Y/\Delta Z$ nor $\Delta X/\Delta Z$ is distorted by simultaneous
equation feedback effect.¹³


11 Are both equations in this system [...]

12 Since the equations in (15) are linear, their coefficients may be interpreted as slopes
(partial derivatives). Specifically, if we ignore the U in the first equation, it is
clear that $\Delta Y/\Delta X = b$.

This method can be shown to give at least consistent estimates. Its
great disadvantage is that where a number of exogenous variables appear
in the system, the choice of Z, the instrumental variable, is arbitrary; but
that choice will affect the final results of the calculation. It may be
mentioned, finally, that in an equation with more parameters to be estimated,
we would need as many instrumental variables as there are parameters.


13 More specifically, it will be recalled from our discussion of the least squares method
in footnote 2 of Chapter 10 that the least squares estimates of a and b in Y = a + bX
are given by the two normal equations. Since in the first equation of our system (15),
whose parameters we wish to estimate, the constant term is missing (i.e., we have
assumed a = 0), the first of the normal equations drops out and the least squares estimate
of b is given by the second normal equation with the a term eliminated:
$\sum YX = b \sum X^2$, i.e., by

\[ b = \frac{\sum YX}{\sum X^2}. \]

In the instrumental-variables method this estimate is replaced by

\[ b^{*} = \frac{\sum YZ}{\sum XZ} = \frac{\sum YZ / \sum Z^2}{\sum XZ / \sum Z^2}. \]

But it will be noted that the numerator and denominator may themselves each be
interpreted as the least squares estimates

\[ \hat{\alpha} = \frac{\sum YZ}{\sum Z^2}, \qquad \hat{\beta} = \frac{\sum XZ}{\sum Z^2} \]

of the coefficients $\alpha$ and $\beta$ in a pair of equations

\[ Y = \alpha Z, \qquad X = \beta Z. \]

Now returning to our original simultaneous-equation system (15) in which we want to
estimate the value of b, assume for the moment that there are no random influences so
that U = V = 0. The reader should verify by first eliminating X from Equations (15)
and then eliminating Y between these equations that the reduced-form equations of the
system (writing its second equation as X = cY + dZ + V) are

\[ Y = \frac{bd}{1 - bc}\,Z \qquad \text{and} \qquad X = \frac{d}{1 - bc}\,Z. \]

These reduced-form equations are two equations of the form $Y = \alpha Z$ and $X = \beta Z$.
Moreover, since

\[ \frac{\alpha}{\beta} = \left(\frac{bd}{1 - bc}\right) \Big/ \left(\frac{d}{1 - bc}\right) = b, \]

our instrumental-variables estimate

\[ b^{*} = \hat{\alpha}/\hat{\beta} \]

may indeed be considered to constitute an estimate of b.
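
The contrast between direct least squares and the instrumental-variables ratio can be illustrated with artificial data. The sketch below rests on several assumptions of our own: the two-equation system is taken to be Y = bX + U and X = cY + dZ + V, the parameter values and the sample size are arbitrary, and U, V, and Z are drawn as independent standard normal variables. None of these choices comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumed "true" structural parameters (illustrative only).
b, c, d = 0.5, 0.4, 1.0

Z = rng.normal(size=n)   # exogenous variable used as the instrument
U = rng.normal(size=n)   # random shock in the first equation
V = rng.normal(size=n)   # random shock in the second equation

# Reduced form implied by Y = b*X + U and X = c*Y + d*Z + V.
X = (d * Z + c * U + V) / (1.0 - b * c)
Y = b * X + U

# Direct least squares (no constant term): sum(YX) / sum(X^2).
b_ols = np.sum(Y * X) / np.sum(X * X)

# Instrumental-variables estimate: sum(YZ) / sum(XZ).
b_iv = np.sum(Y * Z) / np.sum(X * Z)

print(b_ols, b_iv)
```

Under these assumptions the direct least squares figure stays noticeably above the assumed value of b however large the sample, because X is correlated with U, whereas the instrumental-variables figure settles close to b.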


9. The Method of Two-Stage Least Squares 


Theil’s method of two-stage least squares can also be applied to a 
single equation at a time. It starts from a second interpretation of the 
reason for the inappropriateness of using least squares to estimate b in 
our first equation in (15). This explanation bases itself on the fact that in a 
simultaneous equation model X and Y are both functions of the same error 
or shock variables U and V. That is to say, X and Y are both random vari- 
ables dependent on U and V, and this fact produces an extraneous relation- 
ship between X and Y which makes least squares produce inconsistent 
estimates. To see this more specifically, we note that the reduced-form 
equations corresponding to the structural equations (15) must be of the 
form 


\[
\begin{aligned}
X &= A_1 + A_2 Z + w \\
Y &= B_1 + B_2 Z + W,
\end{aligned}
\tag{16}
\]


where w and W are random variables which are themselves dependent on 
U and V. If we actually knew the magnitude of w for each date for which 
we have a statistical figure for X, our difficulties would be at an end, be- 
cause we could then rewrite our equation Y = a + bX + U as 


\[ Y = a + b(X - w) + U + bw = a + b(X - w) + R, \tag{17} \]


where R is the random variable U + bw. Now the variable on the right
of (17) is no longer the random variable X. Instead it has been replaced
by the nonrandom variable X - w, i.e., by the values of X with their
random components, w, removed. Hence in correlating Y with X - w
to obtain b, we would be dealing with only one variable which contains a
random component. It would therefore be perfectly legitimate to use the
ordinary least squares method to determine a value for b in (17) because
the two variables would no longer be interconnected by common random
shocks which can produce spurious correlations between them.

Unfortunately, we have no way of determining the magnitudes of w
for each different date for which we have X statistics. At best we can
estimate w by means of the residuals $\hat{w}$ obtained from a least squares
regression calculated for the first reduced-form equation (16). That is, from this
equation we obtain for every date t an estimated value¹⁴ of X,
$\hat{X}_t = \hat{A}_1 + \hat{A}_2 Z_t$. Subtracting this estimated X from the actual value of X
for that day as given by our statistics, we obtain our estimate of w for that date¹⁵
as $\hat{w}_t = X_t - \hat{X}_t = (A_1 + A_2 Z_t + w_t) - (\hat{A}_1 + \hat{A}_2 Z_t)$. Theil then
proposes that w be replaced by $\hat{w}$ in (17) and that a second regression (hence
the name two-stage least squares) be calculated for (17). Thus, by
comparing the ordinary statistical values of the Y's with the revised values
$X - \hat{w}$ for the X's (which have been doctored to remove the random
influence, w), we obtain a least squares equation for (17) giving us a revised
estimate $\hat{b}$. This is the two-stage least squares estimate of b.


14 Note that the estimation equation for X contains no random term. In effect, we 
are dealing with the least squares estimating line for the dots in Figure 7. The dots 
represent the actual statistics, and the height of the line gives us our estimated $\hat{X}$ values.


Figure 7 


The deviations of the actual dots above and below the line are taken to be due to the 
random influences which are measured by w. Thus, the random element does not have 
any influence on the estimated X as given by the line. Rather, it is given by the deviations 
between the actual and the predicted X’s. 

15 More specifically, suppose we have the following statistical values for X and Z:


Date    X (actual)    Z     X (calculated)    Estimated w = X (actual) - X (calculated)
1950    50            20    47                 3
1951    53            25    57                -4
1952    41            18    43                -2


and that our reduced-form equation has been estimated as $\hat{X}_t = 7 + 2Z_t$. We easily
calculate that since in 1950 Z was 20, then the calculated $\hat{X}_{50}$ must be 7 + 2(20) = 47.
Similarly, we obtain the estimated (column 4) value $\hat{X}_{51} = 7 + 2(25) = 57$ and
$\hat{X}_{52} = 43$. We note that none of these estimated X's is exactly equal to the actual X
values as given in the second column of the table. The last column of the table shows the
calculation of the estimated w for each year by subtraction of the calculated $\hat{X}$'s from
the actual statistical X values.
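
The arithmetic of this table can be reproduced in a few lines. The sketch uses only the three observations and the estimated reduced-form equation given in the footnote; the function and variable names are our own.

```python
# Data from the table: (date, actual X, Z).
observations = [(1950, 50, 20), (1951, 53, 25), (1952, 41, 18)]


def fitted_x(z):
    """Estimated reduced-form equation from the footnote: X-hat = 7 + 2Z."""
    return 7 + 2 * z


residuals = []
for date, x_actual, z in observations:
    x_hat = fitted_x(z)
    # The estimated random component is the actual X minus the calculated X.
    residuals.append(x_actual - x_hat)
    print(date, x_actual, z, x_hat, x_actual - x_hat)
```

Each printed row corresponds to one row of the table, the last figure being the estimated w for that year.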
Although the replacement of w by $\hat{w}$ is theoretically undesirable, it
turns out that $X - \hat{w}$ is "sufficiently" nonrandom that the resulting
estimates of b are consistent. These estimates are not too difficult to calculate
and Theil's method is generally considered to be a significant addition to
the list of available methods of estimating the coefficients of structural
equations in a simultaneous system.
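
The two stages can be sketched on artificial data as follows. Everything numerical here is an assumption made for illustration: the structural system is taken to be Y = a + bX + U and X = cY + dZ + V, the parameter values and sample size are arbitrary, and U, V, and Z are independent standard normal draws. The helper `least_squares` is our own name for an ordinary regression of one variable on another with a constant term.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Assumed structural parameters (illustrative only):
#   Y = a + b*X + U   and   X = c*Y + d*Z + V.
a, b, c, d = 1.0, 0.5, 0.4, 1.0

Z = rng.normal(size=n)
U = rng.normal(size=n)
V = rng.normal(size=n)
X = (c * a + d * Z + c * U + V) / (1.0 - b * c)   # reduced form for X
Y = a + b * X + U


def least_squares(y, x):
    """Return (intercept, slope) of the least squares line of y on x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1]


# Stage 1: regress X on Z, then remove the estimated random component w-hat.
A1_hat, A2_hat = least_squares(X, Z)
w_hat = X - (A1_hat + A2_hat * Z)
X_purged = X - w_hat

# Stage 2: regress Y on the purged values X - w-hat.
a_2sls, b_2sls = least_squares(Y, X_purged)

# For comparison: direct least squares of Y on the unpurged X.
_, b_ols = least_squares(Y, X)
print(b_ols, b_2sls)
```

In this setup the second-stage slope comes out close to the assumed b, while the direct least squares slope does not, which is the behavior the consistency property leads one to expect in a large sample.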
Production and Cost


1. Production, Inputs, and Outputs 


The standard economic discussions of production classify the firm's 
decision variables into only two categories, inputs and outputs. An input is
simply anything which the firm buys for use in its production or other 
processes. An output is any commodity which the firm produces or processes 
for sale. 

The term “processing,” as it is used here, may denote an act of trans- 
portation or storage and does not necessarily imply a manufacturing 
activity. To an economist, all of these may be equally productive acts. 
For example, transportation increases the usefulness of the product by 
bringing it to the location where the consumer needs it—without trans- 
portation the item may be just as useless to him as it would be if it were 
still just a collection of raw materials. Similarly, storage gets the item to 
the consumer when he needs it, just as transportation gets it to him where 
he needs it. The terms “production” and “processing,” then, are used in 
this more general sense, which does not necessarily involve the literal, 
physical transformation of raw materials. 

Management’s production decision problems may be considered to fall 
into four types: 


1. How much, in total, shall be spent on the purchase of inputs?

2. How shall this amount be divided among the various types of
input?

3. How much of each type of input will be allocated to each type
of output?

4. How much of each final product (output) shall the firm
produce?

The answers to the questions are the subjects of this and succeeding 
chapters. 


2. The Production Function 


Decisions on inputs and outputs cannot, of course, be taken inde- 
pendently. There are technological relationships summarized in the 
production function 

\[ y = g(r_1, r_2, \ldots, r_m), \]


which states that y is the maximum amount of commodity Y which the
firm can produce if it uses exactly $r_1$ units of input 1, $r_2$ units of input 2, etc.

Knowledge of such a functional relationship presupposes that a set of
optimality calculations has already been carried out, explicitly or implicitly,
by the firm's engineers or production managers. They must be taken to
have examined the many alternative ways in which inputs can be combined
to produce any given output and to have selected the most efficient way of
using inputs for each potential output of the firm. Thus their calculation is
taken to indicate the maximum output obtainable from any given
combination of inputs. In the following chapter the nature of this implicit
optimality calculation will be described in some detail.¹

Confining our discussion to two inputs to permit the use of two- and
three-dimensional graphs, we can represent the production function in a
set of diagrams very closely analogous to consumer indifference maps and
the utility function in Chapter 9. However, for reasons which will become
clear presently (Section 6), it is customary and convenient to represent
inputs as negative amounts, i.e., the use of 75 tons of coal is interpreted as a
75-ton deduction from the coal resources utilized by society. Consequently,
input axes go either downward or to the left of the origin.


1 That chapter discusses the production-decision process using the standard analysis
of mathematical programming. It shows that this analysis encompasses both the decisions
discussed in the next few pages and the optimization computation implicit in the
estimation of the production function. That is, it considers not only how much of each
output ought to be produced but also at what level each available technological process
should be employed. This is handled by a simple device. Suppose there are two processes
for producing shoes. Instead of using one variable, y, to represent shoe output, we
employ two variables, $y_1$ and $y_2$, to represent the quantities produced by the first and
second processes, respectively. Then the optimal values of $y_1$ and $y_2$ can both be calculated
by the standard programming techniques and these values obviously determine,
implicitly, the optimal combination of use of shoe manufacturing processes 1 and 2 for
that level of output.


Figure 1

In Figure 1a any point R represents a combination of two inputs:
quantity $Or_{1R}$ of cloth and quantity $Or_{2R}$ of labor. Suppose with such a
combination of inputs the largest possible output of aprons is RR'
(Figure 1). Then point R' is one point on the production surface. The
locus of all such points is a bandshell-like surface, a portion of which is
shown in Figure 1b as PP'MP".


3. Relative Input Levels and Production 


We note that, as it is drawn, the production surface does not extend
up to the labor and cloth axes. The reason for this is that at a point such
as $r_{2R}$, which lies on one of the axes, we have a positive quantity of only
one input. At point $r_{2R}$ we have $Or_{2R}$ labor and no cloth. We know that
with such a combination of inputs it is not possible to produce any aprons,
i.e., output at point $r_{2R}$ must be zero. Hence the production surface at
that point must be of zero height. That is what also gives the production
function the roughly upside-down U-shaped cross section PMP'. This
shape indicates that output cannot be produced with cloth alone or
labor alone and that positive outputs can be produced only by
combinations of the two (intermediate points in the diagram). Note that by
more careful cutting and workmanship it is possible to save on cloth by
increased use of labor so that aprons can indeed be produced with varying
proportions of cloth and labor, as the diagram shows.

There are several respects in which the diagram may be misleading: 


1. The cross section PMP' need not be symmetrical, for the two inputs
need not make similar contributions to output, particularly since the
choice of units to measure inputs (man-hours and yards) is completely
arbitrary and the shape of the diagram will change when the unit of
measure is varied: a meter of cloth can be used to produce more aprons
than can a yard of cloth because a meter is slightly longer than a yard.

2. The cross section PMP' need not be smooth: it may have dents,
kinks, or even sharp breaks. For example, consider what happens when
we move from point R' in the direction of P'. This involves a reduction in
the firm's use of labor and an increased use of cloth. At some point along
this line the labor/cloth ratio may become so low that it is necessary to
switch to labor-saving equipment, and this switch may produce a sharp
rise in output, that is, a break (a sudden rise) in the production surface.

3. Point M, the highest point on the cross section PMP', will not
normally represent an optimal arrangement. True, it represents a
technologically productive combination of inputs, but whether it will pay to
employ that input combination depends on relative prices, which we have
not yet brought into the picture. That is, whether it will pay to use 10
minutes or 15 minutes of labor per yard of cloth will depend in part on the
level of wages, no matter what the physical productivity of labor.


4. Properties of Production Functions: Diminishing Returns 


A standard economic assumption affecting the shape of the production
function is the “law” of diminishing returns, which is interpreted here to
mean (eventually) diminishing marginal productivity. This highly plausible
empirical allegation (which seems to be fairly well supported by experience)
states that


As more and more of some input, i, is employed, all other input
quantities being held constant, eventually a point will be reached where
additional quantities of input i will yield diminishing marginal
contributions to total product.


The plausibility argument is that, eventually, other inputs will grow
short relative to input i, and so additional units of i will be at a growing
disadvantage in adding to production. As we hire more and more labor
but do not supply the increased labor force with additional workroom
equipment or raw material, further additions to the labor force may well
be expected to grow less helpful.

The effect of this assumption on the shape of the production surface
is readily shown. Consider line LL' in Figure 1a. Point B on this line
involves more cloth than does point A. But, because LL' is parallel to the
horizontal cloth axis, both A and B involve the same quantities of labor
input. A line parallel to either axis, then, represents the conditions
necessary to test for the presence of diminishing returns: by moving along such
a line we can increase the use of one input while holding the quantity of the
other input constant.

To see now whether the production surface exhibits diminishing
returns we must investigate what happens to the production level as we
make such a move (from A toward B on LL'). For this purpose we examine
the cross section, LSTB, of the production surface taken above line LL'
(Figure 2a). This curve, which is reproduced by itself in Figure 2b, is a
total product curve for cloth given the fixed quantity of labor, OL. We
note that as it is drawn it flattens out as we move toward the left (its slope
diminishes with increasing quantities of cloth input). But we know
(Chapter 3, Section 4) that the marginal product of cloth is represented by
the slope of the total product curve. It follows that as we move to the left
along LL', the marginal product of cloth (the slope of LST) is declining, as
the diminishing-marginal-returns assumption requires. To summarize, if
(and only if) the production function involves diminishing marginal returns,
any cross section taken parallel to either input axis will have the gradually
flattening shape shown in Figure 2b, and not a constant slope (as does OV
in Figure 2d) or an increasing slope (OJ in Figure 2d).
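
A small numerical sketch may help fix the idea of a flattening total product curve. The production function used below, the square root of cloth times the square root of labor, is a hypothetical example chosen for convenience; it is not the function behind the figures, and the input quantities are arbitrary.

```python
def output(cloth, labor):
    """Hypothetical production function (an assumption of this sketch)."""
    return cloth ** 0.5 * labor ** 0.5


labor = 4.0  # labor held fixed, as along a line parallel to the cloth axis
marginal_products = []
for cloth in range(1, 6):
    # Extra output obtained from one more unit of cloth, labor unchanged.
    mp = output(cloth + 1, labor) - output(cloth, labor)
    marginal_products.append(mp)
    print(cloth, round(output(cloth, labor), 3), round(mp, 3))
```

Total product rises with each added unit of cloth, but each successive increment is smaller than the one before, which is the gradually flattening shape described above.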


5. Properties of Production Functions: Returns to Scale


So far we have examined what happens when any one input is increased
by itself. This leads naturally to our next question, what happens when
all inputs are increased together, that is, when the production process is
expanded exactly to scale? For this purpose, let us return again to the floor
of our production diagram (Figure 1a). Suppose we begin with some input
combination, A, and ask how a doubling of both of the input quantities
at A will be represented. We then have the following result:

Draw the straight line OW, which goes through both the origin and
point A. Pick point A* on this line OW so that length OA* = twice (k
times) length OA. Then point A* represents an exact doubling
(multiplication by k) of all of the inputs at point A.²

Moreover, it is simple to extend this result to obtain the converse 
proposition that any proportionate increase (decrease) in all of a firm’s 
inputs must be represented by a movement along some straight line, OW, 
from the origin on the floor of the three-dimensional production diagram 
(Figure 2c). Note that this straight line need not bisect the angle formed 
by the axes. For example, in Figure 1a curve OW is relatively close to the 
labor axis because it represents what looks like a large (constant) labor-to- 
cloth ratio. 

This, then, is how we ask our question: To find out what happens to 
production when all inputs increase in the same proportion (an increase to 
scale) we take a cross section OA*Y of our production diagram (Figure 2c) 
cutting along the straight line on the floor from the origin and examine the 
shape of this cross section. There are three possibilities: 


1. Diminishing returns to scale: The curve representing the top of
the cross section has the shape of P"M in Figure 1b, in which the
slope decreases toward the left.

2. Increasing returns to scale: The top of the cross section has an
increasing slope, as does OJ in Figure 2d.

3. Constant returns to scale: The cross section line is straight (OV
in Figures 2c and 2d).³


2 Proof: Triangles LOA and L*OA* are similar because they are both right triangles
and they have angle LOA in common. Therefore their sides are proportional, i.e., we
have L*A*/LA = OL*/OL = OA*/OA = k. But OL and OL* are the labor input at
points A and A*, respectively, and LA and L*A* are the respective cloth inputs at A and
A*. The result then follows at once.


This last possibility, which is called the case of a linearly homogeneous
production function,⁴ has received a great deal of attention in the literature.
It turns out that such a relationship has extremely convenient
mathematical properties, which make it very useful for purposes of analysis.
Whenever we are fortunate enough to encounter a production function
which (at least approximately) exhibits constant returns to scale, i.e.,
linear homogeneity, we can at once bring to bear a number of special
theorems, several of which will be described presently.
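
The three possibilities can be told apart numerically by multiplying every input by the same factor k and comparing the new output with k times the old. The sketch below does this for three hypothetical two-input production functions; the functional forms, the input quantities, and the helper name `returns_to_scale` are all assumptions made for illustration.

```python
import math


def returns_to_scale(g, inputs, k=2.0, tol=1e-9):
    """Classify returns to scale of g at `inputs` by multiplying every input by k."""
    base = g(*inputs)
    scaled = g(*(k * r for r in inputs))
    if math.isclose(scaled, k * base, rel_tol=tol):
        return "constant"
    return "increasing" if scaled > k * base else "diminishing"


# Three hypothetical two-input production functions (assumptions of this sketch).
def linear_homogeneous(r1, r2):
    return r1 ** 0.5 * r2 ** 0.5


def increasing(r1, r2):
    return r1 * r2


def diminishing(r1, r2):
    return r1 ** 0.25 * r2 ** 0.25


for g in (linear_homogeneous, increasing, diminishing):
    print(g.__name__, returns_to_scale(g, (3.0, 5.0)))
```

For the first function, doubling both inputs exactly doubles output, so it is linearly homogeneous; yet raising either input alone adds less and less to output, so the same function also exhibits diminishing marginal returns to each input taken singly.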

There is also some empirical evidence that the production function for 
the economy as a whole is not too far from being linearly homogeneous.⁵

Finally, it is almost tempting to argue that production functions will 
necessarily exhibit constant returns to scale. The view is that if, in some 
sense, all inputs are, say, tripled, what is there to prevent all outputs from 
being tripled? After all, if we build three identical factories with identical 
work forces, equipment, and raw materials, will we not obtain three times 
the output of a single factory? In this view, given constant prices, there 
are only two reasons why costs (input use) should not vary in exact 
proportion with output: 


1. Limited input quantities: If we increase outputs, but there are some 
factors whose use cannot expand in proportion because their supplies are 
limited, costs per unit will be driven up because there will be diminishing 
returns to those inputs whose use is increased. 


An alternative proof, which also proves the converse, notes that the equation of a
straight line through the origin is y = az (see Chapter 2, Section 3). But input quantities
y and z vary proportionately if and only if y/z = a (a constant), which is precisely the
same as the equation of the straight line through the origin.

3 Note that the "spine" of the diagram, OM, which is one such cross section, is also 
a straight line in this case. 

4 This is the mathematical terminology for the case of constant returns to scale. 
See Sections 9 and 10 below. Incidentally, the reader would do well to convince himself 
that a production function can satisfy the “law of diminishing returns" and yet, simul- 
taneously, exhibit constant returns to scale. It should be noted that both of these 
phenomena occur in linear programming problems. 

5 See Paul H. Douglas, “Are There Laws of Production?" American Economic Review,
Vol. XXXVIII, March 1948; and Robert M. Solow, “Technical Change and the
Aggregate Production Function," Review of Economics and Statistics, Vol. XXXIX,
August 1957. Cf., however, K. J. Arrow, H. B. Chenery, B. S. Minhas, and R. M. Solow,
“Capital-Labor Substitution and Economic Efficiency," Review of Economics and
Statistics, Vol. 43, August 1961, pp. 225-248.

2. Indivisibilities: Some inputs just do not come in small units. We
cannot install half a blast furnace or half a locomotive (a small locomotive 
is not the same as a fraction of a large locomotive). As a result, only if 
operations are carried on on a sufficiently large scale will it pay to employ 
such indivisible items. This, it is said, is the only source of economies of 
large-scale production. In other words, from this point of view all produc- 
tion functions are linear and homogeneous, only, unfortunately, it is not 
always possible to increase or diminish all input uses in exactly the same 
proportion. 


This position has been criticized on several grounds. First of all, it has 
been maintained that one cannot generally duplicate all of the elements in a 
given situation even in principle. A pair of factories in close proximity 
simply is not the same as a duplication of one factory in isolation. The 
existence of another nearby factory affects labor morale, air pollution, the 
cost of labor-force training, etc. 

More important, suppose one larger factory is more efficient than two 
small factories of similar total capacity. Then there is no motivation for a 
businessman to expand by duplicating his original facilities, even if this 
option is open to him. In other words, if he can obtain increasing returns 
to scale, he can be expected to take advantage of such opportunities when 
he expands his output.

There are standard examples of the manner in which such economies
can arise. For example, it was shown in Chapter 1 that the optimal 
inventory level is likely to increase less than in proportion with the scale of a 
firm’s output. In other words, when the firm doubles its sales, it may be 
foolish to double its inventory expenditures. Here is an increasing-returns 
case—an economy of large-scale production. Another standard example is 
the warehouse construction case. Suppose the work in building a cubical 
warehouse is proportionate to the number of bricks used in its construction 
and that, within limits, the number of bricks depends strictly on the wall 
area of the building. It is a matter of elementary geometry that the wall and 
floor areas will increase as the square of the perimeter of the warehouse 
but the volume of the building (the storage area) will increase as the cube 
of the perimeter. In other words, double the land, bricks, and the brick- 
laying labor and one more than doubles warehouse capacity. Here is 
another case of economies of large-scale production. 
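
The warehouse arithmetic can be checked in a few lines. The sketch below is illustrative only: it assumes a cubical warehouse, takes the brick, land, and labor input to be proportional to the area of four walls plus the floor, and takes capacity to be the enclosed volume. The function name `warehouse` is hypothetical.

```python
def warehouse(side):
    """Return (wall_and_floor_area, volume) for a cubical warehouse.

    Assumption for illustration: inputs (bricks, land, labor) are
    proportional to the area of four walls plus the floor.
    """
    area = 5 * side ** 2      # four walls + floor
    volume = side ** 3        # storage capacity
    return area, volume


base_area, base_volume = warehouse(10.0)
# Doubling the area input means scaling each side by the square root of 2.
big_area, big_volume = warehouse(10.0 * 2 ** 0.5)

input_ratio = big_area / base_area          # inputs double
capacity_ratio = big_volume / base_volume   # capacity rises by 2**1.5
print(input_ratio, capacity_ratio)
```

Under these assumptions a doubling of inputs raises capacity by a factor of about 2.83, which is the increasing-returns case described in the text.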

One must conclude that whether or not the production function of a 
particular plant is linearly homogeneous or even approximately so is a 
matter for empirical investigation and cannot be settled by a priori
considerations.6


6 Some of the standard references on this discussion are Nicholas Kaldor, “The
Equilibrium of the Firm," Economic Journal, Vol. XLIV, March 1934; Paul A. Samuel-

6. Notation for Production Functions and Production Sets: Multiproduct Firms


Elementary discussions usually deal only with single-product enterprises
whose output level is $y$ and which use quantities $r_1, \ldots, r_m$ of $m$
different inputs. Then, as we have seen, the production function may be
written as


$$y = g(r_1, \ldots, r_m).$$


For reasons which will become clear in the next few paragraphs, it is
convenient to rewrite this function as


$$y - g(r_1, \ldots, r_m) = 0,$$


which, in turn, can be rewritten as


$$f(y, r_1, \ldots, r_m) = 0.$$


If we consider the possibility of waste or inefficiency in the productive
process [to deal with the entire set of feasible production possibilities and
not just the (efficient) production function], the preceding relationships
are transformed into the following inequalities:


$$y \le g(r_1, \ldots, r_m) \quad \text{or} \quad y - g(r_1, \ldots, r_m) = f(y, r_1, \ldots, r_m) \le 0.$$


The reason we have gone to the trouble of bringing the $y$ inside the functional
relationship is that with this notation it is extremely easy to proceed
to the corresponding relationships for a firm that produces a multiplicity
of outputs. To adapt it to a multiproduct enterprise, with output quantities
$y_1, \ldots, y_n$, the preceding form can simply be rewritten


$$f(y_1, \ldots, y_n, r_1, \ldots, r_m) \le 0,$$


meaning that any one output can be increased either by increasing input 
use holding other output levels constant, or, instead, by reducing other 
outputs without any change in input use (or by some combination of the 
two). 


son, Foundations of Economic Analysis, Harvard University Press, Cambridge, Mass., 
1947, pp. 81-87; Edward H. Chamberlin, “Proportionality, Divisibility, and Economies 
of Scale,” Quarterly Journal of Economics, Vol. LXII, February 1948; "Comments" by 
A. N. McLeod and F. H. Hahn and "Reply" by Chamberlin, same journal, Vol. LXIII, 
February 1949; “Random Variations, Risk and Returns to Scale," Thompson M.
Whitin and Maurice H. Peston, same journal, Vol. LXVIII, November 1954; and
Harvey Leibenstein, “The Proportionality Controversy and the Theory of Production,"
same journal, Vol. LXIX, November 1955.

One last modification in notation simplifies this relationship and
simultaneously permits us to generalize it somewhat further. Consider 
some commodity such as electricity which is an input for some firms and an 
output of some others. A firm may be both a producer and a user of
electricity, and its optimality calculation will determine whether or not it
pays it to produce more than it consumes, i.e., whether it is a supplier of
electricity to others or whether it produces only enough to meet some of its
own demand (as in the case of firms that keep standby generators for
emergencies). If its net output is positive, it is a seller, and electricity is then
one of the firm’s outputs. If its net output is negative, electricity becomes
one of its inputs. We then may let $z_i$ represent either the output or the
input of item $i$, with the convention that the item serves as an output if
$z_i > 0$ and as an input if $z_i < 0$. This is the primary reason for the convention
that input quantities are represented by negative numbers. Then, for
brevity, writing $w = m + n$, the feasible production set becomes


$$f(z_1, \ldots, z_w) \le 0.$$


Sometimes it is not convenient to describe the production possibilities 
in terms of a well-defined function such as $f(z_1, \ldots, z_w) \le 0$. In that case
we simply deal with the set of possible input and output combinations. 
That is, we use the 


Definition: The production set, T, is the set of all points $(z_1, \ldots, z_w)$
in w-dimensional space representing all combinations of inputs and outputs 
that are possible given the available resources and the state of technological 
knowledge. 
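
The sign convention can be made concrete with a small sketch. Everything in it is an assumption made for illustration: a single output produced from two inputs by the arbitrary rule that output cannot exceed the square root of the product of the input quantities, and the hypothetical names `f` and `in_production_set`. Inputs are entered as negative net outputs, as the convention above requires.

```python
def f(z):
    """Hypothetical transformation function f(z_1, z_2, z_3).

    z[0] is the net output of the product (positive = output);
    z[1] and z[2] are net outputs of two inputs (negative = input use).
    """
    y, r1, r2 = z[0], -z[1], -z[2]   # recover the input quantities
    if r1 < 0 or r2 < 0:
        raise ValueError("inputs must be entered as non-positive numbers")
    return y - (r1 * r2) ** 0.5


def in_production_set(z):
    """A plan is feasible when f(z) <= 0, so waste is allowed."""
    return f(z) <= 1e-12


efficient = (4.0, -4.0, -4.0)    # on the frontier: f = 0
wasteful = (3.0, -4.0, -4.0)     # feasible but inefficient: f < 0
infeasible = (5.0, -4.0, -4.0)   # more output than the rule allows
results = [in_production_set(z) for z in (efficient, wasteful, infeasible)]
print(results)
```

The first two plans belong to the production set and the third does not; only the first lies on the efficient boundary where the inequality holds as an equation.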


7. Iso-Product Curves and Production Frontiers 


Consider the production set in two outputs and two inputs given by 


$$f(y_1, y_2, r_1, r_2) \le 0.$$


If we hold both input quantities constant at some levels, call them $r_1^*$, $r_2^*$,
we can determine all the possible output combinations capable of being
produced by that pair of inputs. The resulting locus is called a production
possibility locus, or a production frontier. Figure 3 shows two such curves.
The curve SS’ shows the output possibilities when the available input
combination is $r_1^*$, $r_2^*$, while ZZ' is the possibility locus for some other
(larger) input combination. Point U on SS' is an efficient output combination
for input combination $r_1^*$, $r_2^*$. Point T is feasible but inefficient since it is
possible with the given inputs to produce any output combination in
region TVW, where, in comparison with point T, any such point represents
an increase in at least one of the two outputs with no reduction in the other.

The production frontiers have been drawn concave to the origin, 
indicating that there are diminishing returns to specialization in any one 
output. Thus, on production frontier ZZ’, point Z indicates how large an
output, $y_2$, can be produced if all the inputs used along ZZ' were devoted to
commodity 2, and, similarly, point Z’ represents the maximal output $y_1$
if all available inputs were devoted to commodity 1. If returns were
constant, the feasible output combinations would be represented by the line
segment ZKZ' connecting Z and Z'. But since the production frontier in the
diagram is ZBZ’, which lies outside line ZKZ’ everywhere except at its end
points, it follows that this firm is more efficient in producing a combination
of outputs (as at point B) than in specializing in either output by itself
(point Z or Z’).

A diagram with output quantities (such as $y_1$ and $y_2$) on its axes is called
output space. We can also examine the production set in terms of the input 
combinations corresponding to any given set of output quantities. The 
diagram for this analysis represents input quantities on its axes and is 
referred to as input space. For this purpose, while it is not essential, the 
argument will perhaps be followed more easily if we return to the single-output
case: the production function $f(y, r_1, r_2) \le 0$.

An iso-product locus or a production indifference curve is defined as a 
locus of input combinations all of which are capable of producing the same 
output level. It is a contour line on the floor of the three-dimensional 
diagram representing the “latitudes and longitudes” of points of equal 
height on the production surface. Thus, in Figure 4a, all points on the curve 
marked “10” represent input combinations capable of producing the same 
number of aprons (say 10,000 aprons), as indicated by the 10 next to that
indifference curve. These production indifference curves normally possess
properties which are the same as (or analogous with) those usually assumed 
for consumer indifference curves (see Section 7 of Chapter 9): 


1. They have a negative slope; 

2. If one indifference curve, X, lies farther from the origin than 
another indifference curve, Y, then X normally corresponds to a 
higher output level than Y; 

3. No two indifference curves intersect; 

4. The curves are convex to the origin. 


The rationale of each of these properties is so closely analogous with 
that involved in the theory of the consumer that its investigation is left 
entirely as an exercise for the interested reader. It also follows by an 
argument analogous to that for consumer indifference curves that 


5. The slope of an iso-product locus for inputs $r_1$ and $r_2$ equals
$-mp_1/mp_2$, where $mp_1$ is the marginal product of input 1, etc.


Only one new feature arises in production indifference curve analysis. 
A consumer indifference curve can be defined as a line of constant utility.
But since the entire idea of utility measurement (in this sense) is under 
suspicion, no attempt was made to put numbers next to each consumer 
indifference curve to specify the utility level which it represents. In this 
single-product case the iso-product curve presents no analogous problem 
for the finicky. The output level of a single commodity is a definable 
concept and we need feel no compunction about labelling the curves 5, 10, 
15, etc., as is done in Figure 4a to indicate the production level to which 
each curve corresponds. 


8. Price Line and Expansion Path


As previously indicated, the production indifference map (or the 
production surface) represents only technological information. Such data
alone ordinarily do not permit us to determine the firm’s optimal decisions, 
for we lack price information which can tell us what each input gives us 
for our money. This information is supplied to us by the price line (e.g., 
PP’ in Figure 4a). The price line in production analysis is exactly the 
same as the price or budget line in consumer theory as discussed in Section 9 
of Chapter 9—it is not even a matter of analogy. The price line is again a 
line of constant expenditure—it represents all combinations of labor and 
cloth which can be bought for a fixed amount of money. Thus, for example, 
PP’ in Figure 4a represents all possible combinations of these two inputs 
which together cost exactly $5,000. 

It will clearly be in the interests of the profit-maximizing (or the 
revenue-maximizing) firm to obtain as high a level of production for its 
money as possible. If management is going to spend $5,000, the firm will 
obtain one of the input combinations represented by the points on line 
PP’. Management will want to end up on the highest iso-product curve
(the highest output) consistent with this expenditure. This optimal point
will be the point of tangency, T1, between PP’ and indifference curve 10.
For any other point on PP' must lie on a less lucrative indifference curve.
Point T1, then, represents the optimal input combination for the firm if it
should decide to spend $5,000.

But suppose the firm considers also what will happen if it spends some
other amount of money, say $6,000. This will involve a parallel shift in the
price line, say to P''P'''. The optimal input combination (the maximum
output for the firm's $6,000 outlay) is given by the point of tangency, T2.
Thus, if we draw in curve EE', the locus of all such points of tangency,
we obtain what is called the company's expansion path. For the given
relative prices of the two inputs (the slope of the price line), the expansion 
path tells us how the firm's optimal input combination will vary when 
the size of the company's input budget changes. 

The condition that optimal input combinations occur at points of 
tangency between a price line and a production indifference curve is a
geometric representation of the following basic optimality rule (which
we will encounter again in this book):


Proposition 1: An optimal combination of any two inputs, $i$ and $j$,
requires that the ratio of their marginal products be equal to the ratio of
their prices. Symbolically, we must thus have


$$mp_i/mp_j = p_i/p_j.$$

We have already noted that, by analogy with the argument for consumer
indifference curves, the slope of any of the production indifference curves of
Figure 4a equals the marginal product of labor over the marginal product
of cloth ($mp_L/mp_C$). Moreover, since the slope of the price line equals the
ratio of the two prices (Section 9, Chapter 9), the preceding rule follows at
once. Since two tangent curves have the same slope, at a point like $T_1$ in
Figure 4a we must have $mp_L/mp_C = p_L/p_C$.

The rationale of the rule is also readily explained. Rewrite the equation
as mp_L/p_L = mp_C/p_C. Now, if one added man-hour of labor produces three
aprons (mp_L = 3) and costs $2 (p_L = 2), the ratio mp_L/p_L = 3/2 = 1½
tells us that every additional dollar spent on labor yields 1½ aprons. In
other words, mp_L/p_L is the measure of what the firm gets by putting an
additional dollar into labor. Similarly, mp_C/p_C is the corresponding measure
of the yield of a dollar spent on cloth. If the two happen to be unequal, say
if mp_C/p_C = 2, this means that a reallocation of the company budget
must be profitable: one dollar taken out of labor outlay and transferred to
cloth purchasing will yield a net increase in output of one-half apron.
Obviously, then, if mp_C/p_C exceeds mp_L/p_L, the firm must not be buying
enough cloth to keep the men appropriately busy, and the firm’s cloth-labor
combination cannot be optimal. Only if the two ratios are equal can the
firm be allocating its input expenditures optimally.
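
The apron arithmetic can be reproduced directly. In the sketch below the labor figures are those of the text; for cloth the text gives only the ratio of marginal product to price (2), so the separate values of 4 and 2 are assumptions chosen to match that ratio. The function name is hypothetical.

```python
def output_per_dollar(marginal_product, price):
    """Extra output obtained from one more dollar spent on an input."""
    return marginal_product / price


# Labor: one more man-hour yields 3 aprons and costs 2 dollars (from the text).
labor = output_per_dollar(marginal_product=3.0, price=2.0)

# Cloth: only the ratio 2 appears in the text; 4 and 2 are assumed values.
cloth = output_per_dollar(marginal_product=4.0, price=2.0)

# Approximate effect of moving one dollar from labor to cloth.
net_gain = cloth - labor
print(labor, cloth, net_gain)
```

The gain of one-half apron is a marginal approximation: it holds only for a small transfer, since the marginal products themselves change as the input mix shifts.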

The optimal choice of input proportions as just described has its 
analogue in the choice of relative outputs, which can be examined with the 
aid of our earlier output-space diagram, Figure 3. 

If the prices of $y_1$ and $y_2$ are fixed at $p_1$ and $p_2$, respectively, the total
revenue corresponding to any point $(y_1, y_2)$ in the diagram is given by
$R = p_1 y_1 + p_2 y_2$. This relationship is represented by a family of parallel
straight lines with slope $-p_1/p_2$,


$$y_2 = \frac{R}{p_2} - \frac{p_1}{p_2}\, y_1,$$


the lines such as T1, T2, or T3 in the diagram. Since any possibility locus
involves a fixed combination of inputs and hence a fixed expenditure (cost)
level for the firm, profit maximization requires the firm to select the
highest iso-revenue line on any production frontier. For example, on
frontier SS’ the highest attainable iso-revenue line is T4, which is tangent to
SS’ at point A.


9. Homogeneous and Homothetic Production Functions 


This section deals with a very important class of production functions
for which the expansion path takes a particularly simple and significant
form. This class of production functions, the homogeneous functions, is a
generalization of the case of constant returns to scale. They are most easily
discussed in terms of the single-output case and the most conventional
notation for the production functions: $y = g(r_1, \ldots, r_n)$.

We have the 


Definition: The function $y = g(r_1, \ldots, r_n)$ is homogeneous of degree $s$
in the variables $r_1, \ldots, r_n$ if, when the value of each such variable is
multiplied by the same number, $k$, the value of the function is multiplied
by the $s$th power of $k$, i.e., if


$$g(kr_1, \ldots, kr_n) = k^s y.$$


In particular, where $s = 1$ we have the case of the linearly homogeneous
function (constant returns to scale), since a proportionate increase in all
input values ($kr_i$) then obviously produces an equiproportionate increase
in output, $ky$.

We note at once that where a production function is homogeneous of 
degree s > 1, increasing returns to scale are present throughout, while if 
s < 1, the function exhibits diminishing returns to scale throughout. 

A moment’s thought indicates why this is so. For example, if $s = 2$,
then an equiproportionate increase in all input quantities by the factor
$k > 1$ must increase output by the factor $k^2 > k$. A special case of particular
interest is that of a function homogeneous of degree zero ($s = 0$).
Here any proportionate increase in $r_1, \ldots, r_n$ produces absolutely no
change in $y$, that is, it “changes” $y$ to $k^0 y = y$. As will be seen in a later
chapter, this case is of importance in monetary analysis (rather than 
production theory) where it is often asserted that a proportionate change in 
all prices and asset values only represents an inflation in nominal prices 
but no real change in the price of any commodity and therefore has no 
effect on the real quantity of any good that is demanded or supplied. 

The following three functions, which will be shown to be linearly 
homogeneous, indicate that such functions need not be linear: 


$$y = 3r_1 + 2r_2, \qquad y = r_1^2/r_2, \qquad y = a r_1^{b} r_2^{1-b}, \quad 0 < b < 1.$$

In each of these three cases, if we multiply $r_1$ and $r_2$ by the same constant,
$k$, the corresponding $y$ will also be multiplied by $k$. This is obvious in the
first (linear) example. In the second example we note that we have, upon
multiplication of $r_1$ and $r_2$ by $k$,


$$(kr_1)^2/(kr_2) = k^2 r_1^2/(k r_2) = k r_1^2/r_2 = ky.$$


This example should enable the reader to write out immediately homogeneous
functions of any desired degree. For example, a homogeneous
function of third degree is, by the same argument, $y = r_1^4/r_2$. The third of
our examples of a linearly homogeneous function is the widely used Cobb-Douglas
function to which we will return presently. The reader should
verify its linear homogeneity by substituting $kr_1$ and $kr_2$ for $r_1$ and $r_2$,
respectively.

We have just seen that linearly homogeneous functions are not all linear. 
The converse is also true: Not all linear functions are linearly homogeneous. 
For example, in the function $y = 3r_1 + 2r_2 + 6$, if we double $r_1$ and $r_2$ we
will clearly not double y. Indeed, we see that any linear function which 
contains a constant term is not homogeneous. 
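
A numerical check of these claims is straightforward. The sketch below estimates the degree $s$ from the defining relation $g(kr_1, kr_2) = k^s y$ at a single point; the helper `degree`, the test points, and the Cobb-Douglas parameters are arbitrary choices made for illustration. For a function that is not homogeneous the estimate changes from point to point, which is how the linear function with a constant term reveals itself.

```python
import math


def degree(g, r, k=2.0):
    """Estimate s from g(k*r) = k**s * g(r) at the single point r."""
    scaled = [k * x for x in r]
    return math.log(g(*scaled) / g(*r)) / math.log(k)


def linear(r1, r2):
    return 3 * r1 + 2 * r2


def ratio(r1, r2):
    return r1 ** 2 / r2


def cobb_douglas(r1, r2, a=1.7, b=0.3):   # a and b chosen arbitrarily
    return a * r1 ** b * r2 ** (1 - b)


def with_constant(r1, r2):
    return 3 * r1 + 2 * r2 + 6


point = (5.0, 3.0)
degrees = [degree(g, point) for g in (linear, ratio, cobb_douglas)]

# The estimate for the function with a constant term depends on the point.
s_near = degree(with_constant, (5.0, 3.0))
s_far = degree(with_constant, (50.0, 30.0))
print(degrees, s_near, s_far)
```

The first three estimates equal 1 up to rounding, while the last two differ from each other, so no single degree exists for the function containing the constant 6.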

Before closing our definitional discussion we present one last concept 
which represents a generalization of the concept of homogeneity: 

A homothetic function is any strictly monotonically increasing function 
of any homogeneous function. That is, if $y = g(r_1, \ldots, r_n)$ is a homogeneous
function, then


$$z = F(y) = F[g(r_1, \ldots, r_n)] \text{ is homothetic if } dF/dy > 0.$$


Intuitively, a homothetic production function is related to a homo- 
geneous one much as an ordinal utility function is related to a (neoclassical) 
cardinal utility function. It will be recalled that given any set of indifference 
curves we can construct an ordinal utility function consistent with it by 
assigning an arbitrary "utility" number to each indifference curve
provided that this number increases as we move to preferred indifference curves. 
That is, we can transform any such ordinal utility numbers into another 
legitimate set of utility numbers provided that preferred consumption 
bundles are always assigned higher numbers (that is, provided such a 
transformation is monotonic). In the same way, a homothetic function can 
be obtained from a homogeneous function by replacing the set of numbers y 
with a set of numbers z such that if for two values of y we have y* > y**, 
then z* = F(y*) > F(y**) = z**. That is, the higher of the two values of
y will always be associated with a higher value of z.7


10. Some Properties of Homogeneous (Homothetic) Functions 


Homogeneous functions play an important role in various areas of 
economic analysis. We have already seen in Chapter 5 the crucial role of 


7 An example of a homothetic function is $z = 3r_1 + 2r_2 + 6$, which is clearly not
homogeneous (multiplying $r_1$ and $r_2$ by $k$ does not multiply $z$ by a constant power of $k$).
However, it is a monotone transform of the linearly homogeneous function $y = 3r_1 + 2r_2$,


linear homogeneity in the linear programming model of production, and
this role will be emphasized further in the next chapter. A similar place is
occupied by the concept in input-output theory. It is important in the
theory of distribution, as we will see. Homogeneity of degree zero has a
significant place in monetary analysis, and so on.

The widespread utilization of this sort of relationship is attributable to
its mathematical properties. In this section several of those properties will
be described and derived, though their importance will in some cases only 
be hinted at and left for more detailed discussion at appropriate points in 
the book. 

We have 


Proposition 2: Euler's theorem. If a production function $y = g(r_1, \ldots, r_n)$
is homogeneous of degree $s$, then


$$\frac{\partial g}{\partial r_1} r_1 + \cdots + \frac{\partial g}{\partial r_n} r_n = sy.$$


That is, the partial derivatives of the function, each multiplied by the 
corresponding variable, add up to s times the value of the function.8 

Euler’s theorem has been used to argue that if the production function 
is linearly homogeneous (so that s = 1) and if each input is paid a price 
equal to the value of its marginal product [i.e., the unit price, $p_i$, of resource $i$ is $p\,\partial g/\partial r_i$
and its total payment is $p_i r_i = p(\partial g/\partial r_i) r_i$], then the sum of the payments
to all inputs together will exactly equal the value of total output, py. 
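
Euler's theorem is easy to verify numerically. The sketch below approximates each partial derivative by a central difference and compares the weighted sum with $s$ times the value of the function. The two test functions, the evaluation point, and the helper names are assumptions chosen for illustration; the first function is linearly homogeneous, so its sum reproduces total output, which is the exhaustion-of-product argument in physical units.

```python
def partial(g, r, i, h=1e-6):
    """Central-difference estimate of dg/dr_i at the point r."""
    up, down = list(r), list(r)
    up[i] += h
    down[i] -= h
    return (g(*up) - g(*down)) / (2 * h)


def euler_sum(g, r):
    """Sum over inputs of (marginal product) * (input quantity)."""
    return sum(partial(g, r, i) * r[i] for i in range(len(r)))


def cobb_douglas(r1, r2):      # homogeneous of degree 1
    return 2.0 * r1 ** 0.4 * r2 ** 0.6


def quadratic(r1, r2):         # homogeneous of degree 2
    return r1 ** 2 + r1 * r2


r = (3.0, 5.0)
gap_degree_1 = euler_sum(cobb_douglas, r) - 1 * cobb_douglas(*r)
gap_degree_2 = euler_sum(quadratic, r) - 2 * quadratic(*r)
print(gap_degree_1, gap_degree_2)
```

Both gaps are zero up to the error of the finite-difference approximation.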


which can be expressed as $z = y + 6$. An illustrative nonhomothetic function is
$y = 3r_1^2 + 2r_2^3$. The reader can verify by the methods of Section 16 that its expansion
path will satisfy $r_1/r_2^2 = p_1/p_2 = \text{constant}$, which violates the linearity property of the
expansion path of a homothetic function (Proposition 7).

8 Proof: By definition, since the function is homogeneous of degree $s$,


$$k^s y = g(kr_1, \ldots, kr_n).$$


Taking the total derivative of both sides with respect to $k$ we have by the formula for
total differentiation


$$sk^{s-1} y = \frac{\partial g}{\partial (kr_1)} \frac{d(kr_1)}{dk} + \cdots + \frac{\partial g}{\partial (kr_n)} \frac{d(kr_n)}{dk} = \frac{\partial g}{\partial (kr_1)} r_1 + \cdots + \frac{\partial g}{\partial (kr_n)} r_n.$$


This result must hold for any value of $k$. In particular, it must be valid for $k = 1$. But
for that value of $k$ the preceding equation becomes


$$sy = \frac{\partial g}{\partial r_1} r_1 + \cdots + \frac{\partial g}{\partial r_n} r_n,$$


which is Euler's theorem.

Next we have


Proposition 3: If a production function is homogeneous of any degree $s$,
the marginal product of each and every one of its inputs will be homogeneous
of one lower degree, $s - 1$. That is, if $y = g(r_1, \ldots, r_n)$ is homogeneous
of degree $s$, then any of its partial derivatives, $\partial g/\partial r_i$, will itself
be a function of $r_1, \ldots, r_n$ that is homogeneous of degree $s - 1$.9


Proposition 4: A homogeneous function of $n + 1$ variables $y, r_1, \ldots, r_n$
can be rewritten as a function of the $n$ variables $y/r_n^s, r_1/r_n, \ldots, r_{n-1}/r_n$.


Proof: In the definition of a homogeneous function, take $k = 1/r_n$ and
multiply every variable by this value of $k$. That immediately yields our
desired result:

$$y/r_n^s = g(r_1/r_n, \ldots, r_{n-1}/r_n, 1).$$


Thus, the linearly homogeneous production function y = f(K, L) of the 
quantities of capital, K, and labor, L, is often written with the output-labor 
ratio a function of the capital-labor ratio, i.e., y/L = f(K/L, 1) = F(K/L). 


Proposition 5: Given two functions of the same variables the first of 
which is homogeneous of degree s and the second homogeneous of degree t, 
then a third function obtained by dividing the first function by the second 
will be homogeneous of degree s — t. 


Corollary: In particular, if s = t, the function obtained by this division 
process will be homogeneous of degree zero, i.e., multiplication of each 
variable by k will leave the value of the new function totally unaffected. 


The proof of Proposition 5 and its corollary is trivial, for if we write
$F^1(r_1, \ldots, r_n)$ for the first of these functions, $F^2(\cdot)$ for the second, and
$F^3(\cdot)$ for the ratio of the first to the second, we obtain


$$F^3(kr_1, \ldots, kr_n) = \frac{F^1(kr_1, \ldots, kr_n)}{F^2(kr_1, \ldots, kr_n)} = \frac{k^s F^1(r_1, \ldots, r_n)}{k^t F^2(r_1, \ldots, r_n)} = k^{s-t} F^3(r_1, \ldots, r_n).$$


9 Proof: We are given $k^s y = g(kr_1, \ldots, kr_n)$ or $k^s g(r_1, \ldots, r_n) = g(kr_1, \ldots, kr_n)$. Thus,
differentiating with respect to $r_1$ we obtain

$$k^s \frac{\partial g(r_1, \ldots, r_n)}{\partial r_1} = \frac{\partial g(kr_1, \ldots, kr_n)}{\partial (kr_1)} \frac{d(kr_1)}{dr_1} = \frac{\partial g(kr_1, \ldots, kr_n)}{\partial (kr_1)}\, k.$$

Thus dividing through by $k$ we obtain our result:

$$k^{s-1} \frac{\partial g(r_1, \ldots, r_n)}{\partial r_1} = \frac{\partial g(kr_1, \ldots, kr_n)}{\partial (kr_1)}.$$

That is, if we replace every variable $r_i$ in $\partial g/\partial r_1$ by $kr_i$, then that derivative is multiplied
by $k^{s-1}$. Thus, $\partial g/\partial r_1$ is homogeneous of degree $s - 1$. Obviously, the same result holds
for the derivative with respect to any other input, $r_i$.


We can now prove 


Proposition 6: If the production function is homogeneous of any degree 
s, then along any ray (a straight line through the origin) the slopes of all 
the iso-product curves will be identical. 


Proof: As has already been noted, for reasons completely analogous to
the case of consumer indifference curves, the absolute slope of an iso-product
curve between two inputs, 1 and 2, is the ratio of their marginal
products, $(\partial g/\partial r_1)/(\partial g/\partial r_2)$. But by Proposition 3, since $g(r_1, r_2)$ is homogeneous
of degree $s$, $\partial g/\partial r_1$ and $\partial g/\partial r_2$ will both be homogeneous of the
same degree ($s - 1$). Hence, by the corollary to Proposition 5 their ratio,
$mp_1/mp_2$, will be homogeneous of degree zero. But, along a ray, as we
increase the quantity of one input, we increase the other proportionately,
i.e., if we increase $r_1$ to $kr_1$, we simultaneously increase $r_2$ to $kr_2$. Thus,
any such move along a ray must leave unchanged the slope of the iso-product
locus, $(\partial g/\partial r_1)/(\partial g/\partial r_2)$, since the equation of that slope is
homogeneous of degree zero.10


Finally, we come to the important 


Proposition 7: For any homogeneous (or homothetic) production
function in two-input variables, any expansion path will be a ray.


Suppose (Figure 4b) point A on OL is a point of tangency between a
price line and indifference curve Y. Then any other point, kA, on line OL
must also be such a point of tangency, because all price lines are parallel
(if input prices do not change) and the slope of indifference curve kY at kA
is the same as that of curve Y at point A, as was just shown. Thus, if any 
point on line OL lies on the expansion path, so will any other point on this 
line. 
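
The proposition can be illustrated by brute force. The sketch below assumes a linearly homogeneous technology of the form $y = a r_1^b r_2^{1-b}$ with arbitrary parameters, assumed input prices of 2 and 4, and the two budgets of $5,000 and $6,000 used earlier in the chapter. It searches along each price line for the highest output and compares the resulting input proportions; the function names are hypothetical.

```python
def output(r1, r2, a=1.0, b=0.25):
    """Assumed production function, homogeneous of degree 1."""
    return a * r1 ** b * r2 ** (1 - b)


def best_inputs(budget, p1, p2, steps=10_000):
    """Grid search along the price line for the highest output."""
    best = None
    for j in range(1, steps):
        share = j / steps                 # fraction of budget spent on input 1
        r1 = share * budget / p1
        r2 = (1 - share) * budget / p2
        y = output(r1, r2)
        if best is None or y > best[0]:
            best = (y, r1, r2)
    return best[1], best[2]


r1_a, r2_a = best_inputs(5000.0, p1=2.0, p2=4.0)
r1_b, r2_b = best_inputs(6000.0, p1=2.0, p2=4.0)
proportion_a = r1_a / r2_a
proportion_b = r1_b / r2_b
print(proportion_a, proportion_b)
```

Both optima use the two inputs in the same proportion, so the two tangency points lie on one ray through the origin, as the proposition asserts.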


10 It is easy to show that the same result holds for a homothetic production function
$y = F[g(r_1, r_2)]$. For in this case, by the chain rule of differentiation, the ratio of the
marginal products of $r_1$ and $r_2$ is


$$\frac{mp_1}{mp_2} = \frac{(dF/dg)(\partial g/\partial r_1)}{(dF/dg)(\partial g/\partial r_2)} = \frac{\partial g/\partial r_1}{\partial g/\partial r_2}.$$


Since the function $g(r_1, r_2)$ must, by definition, be homogeneous, it follows once more by
Propositions 3 and 5 that $mp_1/mp_2$ is homogeneous of degree zero.

Hence, the expansion path of a linearly homogeneous production function
will always be a straight line. But, as was shown in Section 5, above, 
all points on such a line involve the same input proportions. We conclude 
that with constant returns to scale and fixed input prices there will be just 
one optimal input proportion (say, 8 yards of cloth per man-hour) which 
does not change no matter what the level of the firm’s output. 

This result is quite convenient. For the businessman it means that he 
need only compute one such figure, and as long as input prices do not 
change he has no further input-proportion decision problems. The theorem 
can also be useful for economic analysis, as we shall see in our input-output 
discussion. 


PROBLEM 


Prove that if the production function $y = g(r_1, r_2)$ is linearly homogeneous
and the average product of $r_1$ is increasing, then the marginal product of $r_2$ must
be negative.


11. Cobb-Douglas Production Functions 
The Cobb-Douglas function is a type of linearly homogeneous pro- 
duction function which has proved particularly useful for empirical work. 
The general formula for this function is 
$$y = a r_1^{b} r_2^{1-b}, \quad \text{where } 0 < b < 1.$$


The property of this function that makes it particularly attractive is 


Proposition 8: The Cobb-Douglas production function is linear if 
rewritten in terms of the logarithms of its variables. 


We have by the usual rules for logarithms of powers and products

$$\ln y = \ln a + b \ln r_1 + (1 - b) \ln r_2.$$

That is, if we write $y^* = \ln y$, $a^* = \ln a$, etc., we have the linear function

$$y^* = a^* + b r_1^* + (1 - b) r_2^*.$$


This means that if one has a Cobb-Douglas function one can take advantage 
of the many simplifications in the process of statistical estimation that are 
possible in the case of a linear relationship. 
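
One way the simplification might be exploited is sketched below. Subtracting $\ln r_2$ from both sides of the logarithmic form gives $\ln(y/r_2) = \ln a + b \ln(r_1/r_2)$, a straight line in one variable that ordinary least squares can fit. The data are synthetic and noise-free, generated from assumed values $a = 2$ and $b = 0.3$, and the function name is hypothetical; real data would contain errors and call for proper statistical treatment.

```python
import math


def fit_cobb_douglas(data):
    """Fit ln(y/r2) = ln(a) + b*ln(r1/r2) by ordinary least squares.

    data is a list of (y, r1, r2) observations. The regression imposes
    linear homogeneity, i.e. exponents b and 1 - b.
    """
    xs = [math.log(r1 / r2) for y, r1, r2 in data]
    ys = [math.log(y / r2) for y, r1, r2 in data]
    n = len(data)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (v - y_bar) for x, v in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(y_bar - b * x_bar)
    return a, b


true_a, true_b = 2.0, 0.3     # assumed parameters for the synthetic data
inputs = [(1.0, 4.0), (2.0, 3.0), (5.0, 1.0), (8.0, 6.0), (3.0, 9.0)]
data = [(true_a * r1 ** true_b * r2 ** (1 - true_b), r1, r2)
        for r1, r2 in inputs]

a_hat, b_hat = fit_cobb_douglas(data)
print(a_hat, b_hat)
```

Because the synthetic observations lie exactly on the assumed function, the fit recovers the parameters up to rounding error.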

One may then ask why do we not simply use a linear production
function. The answer is that such a purely linear relationship commits
us to premises about reality which we do not want to accept. For example,
with a linear production function $y = ur_1 + vr_2 + w$, the marginal
product of any input, say input 1, is $\partial y/\partial r_1 = u$, which is a constant.
That means that (a) the marginal product of input 1 cannot possibly
diminish with the quantity $r_1$ (it must violate the “law” of diminishing
returns) and (b) the marginal product of $r_1$ must be totally unaffected by
the available quantity of other inputs. That is, labor must have the same 
marginal product whether it has available to it a large quantity of equip- 
ment, or a small quantity, or none. Obviously, neither of these implications 
of the linearity assumption for the production function is really palatable. 
However, with a Cobb-Douglas function we have 


Proposition 9: The marginal product of an input in a Cobb-Douglas 
production function decreases when the quantity of that input rises 
(diminishing marginal returns) and increases when the available quantities 
of the other inputs rise.¹¹ 
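
Proposition 9 can be illustrated with a small numerical sketch, again assuming the hypothetical parameter values a = 2 and b = 0.3. The code evaluates the marginal product of input 1 while first raising r1 with r2 held fixed, and then raising r2 with r1 held fixed.

```python
def marginal_product_1(r1, r2, a=2.0, b=0.3):
    """dy/dr1 for y = a * r1**b * r2**(1 - b), i.e. b * a * r1**(b - 1) * r2**(1 - b)."""
    return b * a * r1**(b - 1) * r2**(1 - b)

# Marginal product of input 1 as r1 grows (r2 fixed) and as r2 grows (r1 fixed).
mp_more_r1 = [marginal_product_1(r1, 5.0) for r1 in (1.0, 2.0, 4.0, 8.0)]
mp_more_r2 = [marginal_product_1(5.0, r2) for r2 in (1.0, 2.0, 4.0, 8.0)]

falls_with_r1 = all(x > y for x, y in zip(mp_more_r1, mp_more_r1[1:]))
rises_with_r2 = all(x < y for x, y in zip(mp_more_r2, mp_more_r2[1:]))
print(falls_with_r1, rises_with_r2)
```

A check at a handful of points is of course no proof; it only shows the pattern that the proposition asserts for all positive input quantities.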


12. Elasticity of Substitution: Response to Relative Input Prices 


So far we have taken input prices to be fixed. This is, for example, a 
crucial premise in the calculation of the expansion path. When this assump- 
tion is dropped, it is helpful to have a measure of the responsiveness of the 
optimal proportions among the firm’s inputs to changes in their relative 
prices. The measure used for this purpose is the elasticity of substitution. 
If a moderate rise in wages relative to the cost of capital leads to a sub- 
stantial replacement of labor by machinery, we say the elasticity of sub- 
stitution is large. On the other hand, if in that case there is little change in 
the capital-labor ratio, the elasticity of substitution is small. Specifically, 
we have the 


Definition: The elasticity of substitution for inputs 1 and 2 is the ratio 
of the percentage change in their relative quantity, $r_1/r_2$, to the associated 
percentage change in their relative price, $p_1/p_2$. That is, it is given by 


$$\frac{100\,d(r_1/r_2)}{r_1/r_2} \bigg/ \frac{100\,d(p_1/p_2)}{p_1/p_2} = \frac{d(r_1/r_2)}{d(p_1/p_2)} \cdot \frac{p_1/p_2}{r_1/r_2}.$$


11 Proof: The first and second partial derivatives of the Cobb-Douglas function with 
respect to $r_1$ are (by direct differentiation) 

$$\frac{\partial y}{\partial r_1} = b\,a\,r_1^{b-1} r_2^{1-b} \qquad \text{(which increases with } r_2\text{)}$$
and 
$$\frac{\partial^2 y}{\partial r_1^2} = (b - 1)\,b\,a\,r_1^{b-2} r_2^{1-b} < 0, \qquad \text{since } b < 1.$$

It is easy to see that the elasticity of substitution of two inputs in a 
production function varies inversely with the curvature of their iso-product 
curves: the greater their curvature, the smaller the elasticity of sub- 
stitution. In Figures 5a and 5b we see two pairs of price lines, one pair steep 
and one flat. One represents a comparatively low price ratio $p_1/p_2$ for the 
two inputs and the other a relatively high price ratio. 

In Figure 5a, where they are both tangent to a mildly curved iso- 
product curve, their tangency points T and T’ are far apart. That is, the 
given change in relative prices corresponds to a substantial change in 
input proportion, r2/r1. Thus, in that case, elasticity of substitution is 
high. The reverse is clearly true in Figure 5b. 

The reason for the association between low curvature and high 
elasticity of substitution is not difficult to see. With a given change in the 
slope of the price line (a given change in $p_1/p_2$) the flatter the iso-product 
curve, the farther along it one must move to find the new point of tan- 
gency, i.e., the farther one must move to find the point on the iso-product 
curve with a slope equal to that of the new price line. 

Figure 5c with its right-angled iso-product locus represents the extreme 
case in which elasticity of substitution is zero. A change in the slope of the 
(broken) price lines produces absolutely no change in input proportions. 
On the other hand, Figure 5d represents the opposite extreme: infinite 
elasticity of substitution. A small change in relative prices from the price 
line $p_a$ to $p_b$ causes a drastic switch from exclusive use of input 1 (point A) 
to exclusive use of input 2. 

There is, of course, a full range of intermediate cases in which elasticity 
of substitution is constant throughout but is neither zero nor infinite. For 
example, a simple but tedious calculation can be invoked to show that the 
Cobb-Douglas production function yields a unit elasticity of substitution 
throughout. This and other production functions with constant elasticity 
of substitution (CES) have proved extremely useful for econometric 
estimation of production relationships.¹² 
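
The calculation for the Cobb-Douglas case can be sketched as follows. The sketch assumes that the firm chooses its inputs so that the price ratio equals the ratio of marginal products (the optimality condition derived in Section 16), and it uses the definition of the elasticity given above.

```latex
\frac{p_1}{p_2}
  = \frac{\partial y/\partial r_1}{\partial y/\partial r_2}
  = \frac{b\,a\,r_1^{b-1} r_2^{1-b}}{(1-b)\,a\,r_1^{b} r_2^{-b}}
  = \frac{b}{1-b}\cdot\frac{r_2}{r_1}
\quad\Longrightarrow\quad
\frac{r_1}{r_2} = \frac{b}{1-b}\left(\frac{p_1}{p_2}\right)^{-1},
\qquad
\frac{d(r_1/r_2)}{d(p_1/p_2)}\cdot\frac{p_1/p_2}{r_1/r_2} = -1 .
```

The elasticity therefore has magnitude one at every point. The minus sign only records that a rise in the relative price of an input reduces its relative use.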


13. Derivation of Cost Curves¹³ 


From the firm’s expansion path it is fairly easy to find the firm’s total 
and average cost curves. The total cost curve is, as we know, defined as a 
curve which shows how total company outlays vary with its level of pro- 
duction, and the average (per unit) cost curve is defined analogously. 

It will be recalled that price line PP’ in Figure 4a was taken to repre- 
sent an outlay of $5,000. Thus, point of tangency T1 on this price line tells 
us that the maximum output obtainable for that outlay is 10,000 aprons. 
This information is represented by point C1 in Figure 6. Similarly, point 
C2 in Figure 6 tells us that it will cost $6,000 to produce 15,000 aprons, 
which is the information given by point T2 in Figure 4a, and so on. The 
curve OC in Figure 6, which is the locus of all points like C1 and C2, is the 
company's total cost curve. 

It is also possible to find the firm's average cost curve directly with 
the aid of Figure 4a. For example, point T1 tells us that the unit cost of 
producing 10,000 aprons is $5,000/10,000 = 50 cents. We can find the 
same information for every other point on the expansion path, EE', and 
by recording these data on another graph (not shown) we obtain the 
firm's average cost curve. 

Alternatively, we can use the methods of Chapter 3 to obtain the firm's 
average and marginal costs from its total cost curve in Figure 6. 
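
The arithmetic of this section can be written out in a few lines. The sketch below uses only the two expansion-path points quoted in the text; the incremental cost between them is a discrete stand-in for marginal cost, and the variable names are our own.

```python
# (output in aprons, total outlay in dollars) read off the expansion path in the text
expansion_path_points = [(10_000, 5_000.0), (15_000, 6_000.0)]

# Average cost at each point: total outlay divided by output.
average_costs = [outlay / output for output, outlay in expansion_path_points]

# Incremental cost per apron between the two points (a discrete stand-in for marginal cost).
(q1, c1), (q2, c2) = expansion_path_points
incremental_cost = (c2 - c1) / (q2 - q1)

print(average_costs)     # [0.5, 0.4]
print(incremental_cost)  # 0.2
```

The incremental cost (20 cents) lies below both average costs, which is consistent with average cost falling between the two outputs.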


12 The classic discussion of this subject is the article by Arrow, Chenery, Minhas, 
and Solow, op. cit. 

13 For a more systematic discussion of the deeper relationships between cost curves 
and production functions, see Chapter 14, Sections 11-15. An illustration of the 
mathematical derivation of a cost function from a production function is provided in 
Section 16 of this chapter. 


14. Long Run and Short Run: Definitions 


Before we go any further, it is necessary to define a bit of economists’ 
jargon: the terms long run and short run. These do not refer to any fixed 
units of calendar time—we cannot say in advance and without reference 
to a specific problem that a two-month period lies in the short run and that 
a five-year period extends into the long run. 

Rather, these concepts are defined flexibly in terms of the period over 
which the company's commitments extend. The very long run is a period 
so long that all of the firm's present contracts will have run out, its present 
plant and equipment will have been worn out or rendered obsolete and will 
therefore need replacement, etc. In other words, the long run is a period 
of sufficient duration for the company to become completely free in its 
decisions from its present policies, possessions, and commitments. For 
example, if the company finds that the demand for its product has in- 
creased substantially, it may be ten years before it can afford to redesign 
its plant and equipment completely in accord with the requirements of 
this development. Obviously, even that ten-year figure is flexible. The 
larger the shift in demand, the sooner a reconstruction of the plant will be 
profitable, so that the length of time that can appropriately be considered 
to constitute the long run is itself an economic variable. 

The other extreme case, the very short run, is that where the firm has a 
minimum of free choice. In the very short run a firm will not even be able 
to increase its output in response to increased consumer demand. To do 
this it must acquire more raw materials, perhaps it must arrange for some 
of its labor force to work overtime, and it may also have to hire more 
labor. Even after all of this is arranged, it will take time for the increased 
production flow to begin rolling off the assembly line. In the very short 
run, then, the firm can only satisfy increased demands out of inventory. 

In between these extreme cases, the very short and the very long run, 
there are all sorts of intermediate time periods in which the firm can make 
partial adjustments to any changes in the situation. But in any such in- 
between period it will find its options circumscribed to some extent by 
previous commitments. 


15. Long-Run and Short-Run Average Costs 


These concepts enable us to examine somewhat further the nature of 
the data which lie behind the firm's cost relationships. 

Imagine a firm which is considering renting one of four factories where 
the owners of these factories all insist on, say, two-year leases. Call these 
four factories, arranged in increasing order of size, S, T, V, and W. If our 
firm decides to lease factory S, its average cost curve for the next two years 
will (other things remaining unchanged) then be given, say, by curve SS’ 
in Figure 7a. The other U-shaped curves in Figure 7a can be interpreted 
similarly. We see that if, for example, the firm expects to produce and sell 
output OVm, it will pay its management to rent factory V, the second largest 
of the available factories. For although factories T and W are also both 
capable of producing that output, it can be done in plant V at lower unit 
cost than in either of the other facilities. 

Suppose that the firm decides to lease plant T. TT' in Figure 7a will 
then be its average cost curve for the two-year lease period, that is, its short-run 
average cost curve. Once it has committed itself to T, if the firm ends up 
producing output OVm, it will for the next two years have no choice but 
to incur average cost VmK. In other words, the U-shaped curves in the 
diagram are the alternative short-run cost curves available to the firm. 

The corresponding long-run cost curve is also apparent from the 
diagram. For before the firm has made its commitment it will be free to 
choose the plant size most appropriate for its anticipated output—it will 
be able to lease that plant which produces its output at the lowest possible 
cost. Thus with output OSm it will want to use plant T and produce at 
unit cost SmL; with output OVm it will want to use plant V so that its 
unit costs will be VmJ; etc. In sum, the firm's long-run average cost curve 
will be the heavy scalloped curve SALBJCW’. This curve consists of the 
lowest segments of all of the short-run average cost curves. 
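
The construction of the scalloped long-run curve can be imitated numerically. The sketch below is only an illustration: the four parabolic short-run curves and all of their numbers are invented, since the text gives no data for Figure 7a. For each output the code picks the plant with the lowest unit cost.

```python
# Hypothetical U-shaped short-run average cost curves for plants S, T, V, W.
# Each entry is (output at the plant's minimum, minimum unit cost); numbers are invented.
plants = {"S": (100, 9.0), "T": (200, 7.0), "V": (300, 6.2), "W": (400, 6.5)}

def short_run_ac(plant, q, curvature=0.0001):
    """Unit cost of producing q in the given plant: a parabola around the plant's best output."""
    q_min, ac_min = plants[plant]
    return ac_min + curvature * (q - q_min) ** 2

def long_run_ac(q):
    """Long-run unit cost: the cheapest plant for output q, together with that plant's name."""
    best = min(plants, key=lambda name: short_run_ac(name, q))
    return best, short_run_ac(best, q)

for q in (20, 100, 200, 300, 450):
    print(q, long_run_ac(q))
```

With these invented numbers plant S is chosen only for very small outputs. At output 100, where S reaches its own minimum unit cost, plant T is already cheaper.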

[Figure 7. Vertical axis: average cost.] 

Sometimes the firm has an unlimited number of alternatives in picking 
its plant capacity. This would be the case if it were having an architect 
draw up plans for a new factory (rather than looking for an existing prop- 
erty to rent). In such a situation there would be an infinity of possible 
short-run cost curves, some of which are represented in Figure 7b. Here 
it is to be noted that the scallops can be smoothed out of the long-run cost 
curve, CNC’. The smooth long-run curve consists, as before, of the “bot- 
tom” of the set of the short-run curves. For obvious reasons it is called the 
envelope of the short-run curves. 

One interesting theorem follows from these drawings: It will not always 
pay to use a plant at an output level where it operates at minimum unit 
costs! For example, in Figure 7a it will pay to rent plant S if it is expected 
that very low outputs will be called for. But suppose the firm anticipates 
turning out output OSm, at which plant S is at its “most efficient” (its 
point of minimum unit costs). At this output, plant T is even more efficient 
so that it will pay to use T rather than S to produce OSm. In other words, 
if it pays to rent plant S, it will only be for the production of outputs well 
below the technical “capacity” of that plant! A somewhat similar conclu- 
sion holds for plant W, as an examination of the cost situation at output 
OWm will readily show. 

Indeed, in the smooth long-run average-cost-curve case in Figure 7b 
there will be almost no plant which should be used at its point of minimum 
cost. For example, consider plant R, whose short-run cost curve touches 
the long-run curve only at point P (output OQR). This, then, is the only 
output at which it pays to use plant R. But at P curve RR’ is tangent to 
the long-run average cost curve, which happens to have a positive slope at 
that point. Therefore, at P, curve RR’ must also have a positive slope, i.e., 
P cannot possibly be the minimum point of short-run average curve RR’. 

The only exception is plant M, whose short-run cost curve MM’ touches 
the long-run curve at its minimum point N, for there both these curves 
will be level and so they will both be at their minimum points. 
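
The envelope argument can also be illustrated numerically. The sketch below assumes a purely hypothetical family of short-run curves indexed by a plant size k, each with its minimum at output q = k; the functional form and the numbers are invented for the illustration. For a given output it searches for the plant size with the lowest unit cost.

```python
def short_run_ac(q, k):
    """Hypothetical unit cost of output q in a plant of design size k (minimum at q = k)."""
    return 5.0 + (q - k) ** 2 + 0.5 * (k - 10.0) ** 2

def best_plant(q, sizes):
    """Plant size giving the lowest unit cost for output q (a point on the envelope)."""
    return min(sizes, key=lambda k: short_run_ac(q, k))

sizes = [i / 100 for i in range(0, 2001)]   # candidate plant sizes 0.00 .. 20.00

for q in (4.0, 10.0, 16.0):
    k = best_plant(q, sizes)
    print(q, k, k == q)   # k == q only where the long-run curve is at its minimum
```

In this family the long-run curve has its minimum at output 10. Only there does the chosen plant operate at its own minimum-cost output; for smaller outputs the chosen plant is run below that point, and for larger outputs above it.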


16. Some Elementary Mathematics of Production Theory¹⁴ 


Proposition 1 of Section 8 on the optimal combinations of inputs for 
the firm is easily derived with the aid of the Lagrange multiplier methods 
of Chapter 4, Section 8. Let the firm's production function be represented 
by 

$$y = g(r_1, r_2, \ldots, r_n),$$


14 We will return to the use of these methods in Chapter 14 in the discussion of duality 
theory in production. 

where $r_1$ is the quantity of input 1 (say labor) used by the firm, $r_2$ is the 
quantity of input 2 (say leather), etc. Given any output level, y*, the 
firm will try to produce y* as cheaply as it can. This means that it is trying 
to minimize its expenditure, m, on the inputs used to produce y*, where 
this expenditure is given by 


$$m = p_1 r_1 + p_2 r_2 + \cdots + p_n r_n.$$


Here $p_1$ is the price of input 1, etc. The firm is trying to minimize m 
subject to the constraint on its operations given by the production func- 
tion. To obtain the Lagrangian expression for this constrained minimization 
problem, we rewrite the constraint into the standard form 


$$y^* - g(r_1, r_2, \ldots, r_n) = 0$$


and multiply it by the artificial variable, $\lambda$. Adding this to the expression 
for m, which we are trying to minimize, we have our Lagrangian expression 


$$m_\lambda = p_1 r_1 + p_2 r_2 + \cdots + p_n r_n + \lambda[y^* - g(r_1, r_2, \ldots, r_n)].$$


It is minimized by setting each of its partial derivatives equal to zero, in 
turn, to obtain 


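$$\begin{aligned}
\frac{\partial m_\lambda}{\partial r_1} &= p_1 - \lambda \frac{\partial g}{\partial r_1} = 0 \\
&\;\;\vdots \\
\frac{\partial m_\lambda}{\partial r_n} &= p_n - \lambda \frac{\partial g}{\partial r_n} = 0 \\
\frac{\partial m_\lambda}{\partial \lambda} &= y^* - g(r_1, r_2, \ldots, r_n) = 0.
\end{aligned}$$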


This is a system of n + 1 simultaneous equations which can presumably 
be solved for the optimal values of our n input variables, $r_1, r_2, \ldots, r_n$, 
as well as the value of the Lagrangian variable, $\lambda$. 
In particular, to derive Proposition 1 we rewrite the first two of these 
partial derivative equations as 

$$p_1 = \lambda \frac{\partial g}{\partial r_1} \quad \text{and} \quad p_2 = \lambda \frac{\partial g}{\partial r_2}$$

and dividing one equation by the other, we obtain, cancelling out the $\lambda$'s, 

$$\frac{p_1}{p_2} = \frac{\partial g/\partial r_1}{\partial g/\partial r_2},$$

which, noting that $\partial g/\partial r_1$ is the marginal product of $r_1$, etc., gives us our 
result in Section 8. The reader should also convince himself that the 
first n of the partial derivative equations $\partial m_\lambda/\partial r_i = 0$ determine the 
firm’s expansion path. 
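
The optimality condition can be checked numerically for a particular production function. The sketch below assumes, purely for illustration, a Cobb-Douglas function with a = 1 and b = 0.5, input prices 4 and 1, and a target output of 10; none of these numbers comes from the text. It searches along the iso-product curve for the cheapest bundle and then compares the ratio of marginal products with the price ratio.

```python
A, B = 1.0, 0.5    # illustrative Cobb-Douglas parameters (assumed, not from the text)

def isoquant_r2(r1, y_target):
    """Input 2 needed to produce y_target with r1 units of input 1 when g = A * r1**B * r2**(1 - B)."""
    return (y_target / (A * r1**B)) ** (1.0 / (1.0 - B))

def cheapest_bundle(y_target, p1, p2, grid):
    """Search along the isoquant for the bundle with the lowest outlay m = p1*r1 + p2*r2."""
    r1 = min(grid, key=lambda x: p1 * x + p2 * isoquant_r2(x, y_target))
    return r1, isoquant_r2(r1, y_target)

p1, p2, y_target = 4.0, 1.0, 10.0
grid = [i / 100 for i in range(1, 5001)]          # candidate r1 values 0.01 .. 50.00
r1, r2 = cheapest_bundle(y_target, p1, p2, grid)

# Ratio of marginal products for this function: (B / (1 - B)) * (r2 / r1).
mp_ratio = (B / (1.0 - B)) * (r2 / r1)
print(r1, r2, mp_ratio, p1 / p2)
```

The grid search is a crude substitute for solving the n + 1 equations, but it makes the point that at the cheapest bundle the two ratios coincide.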


Example 1: Expansion Path 
Find the firm’s expansion path expressed in terms of its total expenditure 
(in dollars) on its inputs, labor, L, and capital, K, given the production function 


$$y = 2\log L + 4\log K$$


and input prices $P_L$ and $P_K$. 
The object is to maximize output y, subject to the expenditure constraint 


$$P_L L + P_K K = M.$$
This yields the Lagrangian expression 
$$y_\lambda = 2\log L + 4\log K + \lambda(M - P_L L - P_K K).$$


Taking partial derivatives we obtain among our first-order conditions 


$$\frac{\partial y_\lambda}{\partial L} = \frac{2}{L} - \lambda P_L = 0, \quad \text{i.e.,} \quad \frac{2}{L} = \lambda P_L,$$


$$\frac{\partial y_\lambda}{\partial K} = \frac{4}{K} - \lambda P_K = 0, \quad \text{i.e.,} \quad \frac{4}{K} = \lambda P_K.$$


Dividing the first equation through by the second we obtain 
$$K/2L = P_L/P_K.$$


Thus, with $P_L$ and $P_K$ given, the right-hand side is a constant, call it a, and the 
equation becomes K = 2aL, which is a straight-line expansion path through the 
origin. 
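
The straight-line expansion path can be confirmed by brute force. The sketch below assumes natural logarithms and the illustrative prices 4 and 12 for labor and capital (any positive prices would do). For several budgets M it searches along the budget line for the output-maximizing bundle and prints the resulting ratio K/L.

```python
import math

def output(L, K):
    """Production function of Example 1 (natural logarithms assumed)."""
    return 2 * math.log(L) + 4 * math.log(K)

def best_bundle(M, P_L, P_K, steps=30_000):
    """Grid search along the budget line P_L*L + P_K*K = M for the output-maximizing bundle."""
    candidates = ((M / P_L) * i / steps for i in range(1, steps))
    L = max(candidates, key=lambda labor: output(labor, (M - P_L * labor) / P_K))
    return L, (M - P_L * L) / P_K

P_L, P_K = 4.0, 12.0                      # illustrative prices
for M in (60.0, 120.0, 240.0):
    L, K = best_bundle(M, P_L, P_K)
    print(M, round(L, 3), round(K, 3), round(K / L, 3))   # K/L stays near 2*P_L/P_K
```

The ratio K/L is the same for every budget, which is what a straight-line expansion path through the origin means.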

Example 2: Supply Function 

Using the production function and input prices of the preceding problem, 
determine the supply function of the product whose price is taken to be an 
unspecified constant, P, and where $P_L = 4$, $P_K = 12$. 

The object of the firm is to maximize its profits 


$$\Pi = Py - 4L - 12K$$


subject to the constraint imposed by the company's production function as given 
in Example 1. This yields the Lagrangian expression 


$$\Pi_\lambda = Py - 4L - 12K + \lambda(y - 2\log L - 4\log K),$$
whose partial derivatives at its maximum are 


$$\begin{aligned}
\frac{\partial \Pi_\lambda}{\partial y} &= P + \lambda = 0 \\
\frac{\partial \Pi_\lambda}{\partial L} &= -4 - \frac{2\lambda}{L} = 0 \\
\frac{\partial \Pi_\lambda}{\partial K} &= -12 - \frac{4\lambda}{K} = 0 \\
\frac{\partial \Pi_\lambda}{\partial \lambda} &= y - 2\log L - 4\log K = 0.
\end{aligned}$$


These equations clearly have the solution 


$$\lambda = -P, \quad L = \frac{P}{2}, \quad K = \frac{P}{3}, \quad y = 2\log\frac{P}{2} + 4\log\frac{P}{3},$$


which constitute our supply function as well as derived demand functions for the 
two inputs. 
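
A quick numerical check of this solution is sketched below. It assumes natural logarithms and an arbitrary product price P = 30 (a value chosen only for the check), and compares profit at the closed-form input bundle with profit at nearby bundles.

```python
import math

def profit(L, K, P):
    """Profit P*y - 4L - 12K with y = 2 log L + 4 log K (natural logarithms assumed)."""
    return P * (2 * math.log(L) + 4 * math.log(K)) - 4 * L - 12 * K

def supply(P):
    """Closed-form solution from the text: L = P/2, K = P/3, y = 2 log(P/2) + 4 log(P/3)."""
    L, K = P / 2, P / 3
    return L, K, 2 * math.log(L) + 4 * math.log(K)

P = 30.0                                   # an arbitrary product price for the check
L_star, K_star, y_star = supply(P)

# Profit at the closed-form bundle should be at least as high as at nearby bundles.
nearby = [(L_star * s, K_star * t) for s in (0.9, 1.0, 1.1) for t in (0.9, 1.0, 1.1)]
beats_neighbors = all(profit(L_star, K_star, P) >= profit(L, K, P) for L, K in nearby)
print(L_star, K_star, beats_neighbors)
```

Since the profit expression is concave in L and K, a point that beats its neighbors in this way is the global maximum.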


Example 3: Cost Function 
Find the expression giving total cost as a function of output y when the 
production function is $y = L^2 + K^2$. 

We proceed by solving for K and L as functions of y and substituting the 
result into $M = P_L L + P_K K$. To find the optimal K and L our Lagrangian 
this time is 

$$y_\lambda = L^2 + K^2 + \lambda(M - P_L L - P_K K),$$


which gives us among the first-order conditions 
$$2L = \lambda P_L, \qquad 2K = \lambda P_K$$

or 
$$L/K = P_L/P_K, \qquad L = (P_L/P_K)K.$$
Substituting this into the production function we have 

$$y = (P_L/P_K)^2 K^2 + K^2 = [(P_L/P_K)^2 + 1]K^2.$$
Writing $a = (P_L/P_K)^2 + 1$ we have 
$$K = \sqrt{y/a}$$

and 

$$L = (P_L/P_K)\sqrt{y/a}.$$

Thus, the cost function is 


$$C = P_L L + P_K K = (P_L^2/P_K)\sqrt{y/a} + P_K\sqrt{y/a}.$$
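
The algebra of this example can be verified with a short sketch. The values y = 25, P_L = 3 and P_K = 4 are illustrative assumptions, and the function names are our own. The check covers only the algebra of the first-order conditions; whether the resulting bundle is in fact the cheapest one depends on second-order conditions, which the example does not examine.

```python
import math

def example3_bundle(y, P_L, P_K):
    """Input quantities from the text's first-order conditions for y = L**2 + K**2."""
    a = (P_L / P_K) ** 2 + 1
    K = math.sqrt(y / a)
    L = (P_L / P_K) * K
    return L, K

def example3_cost(y, P_L, P_K):
    """Cost expression from the text: (P_L**2 / P_K) * sqrt(y/a) + P_K * sqrt(y/a)."""
    a = (P_L / P_K) ** 2 + 1
    return (P_L ** 2 / P_K) * math.sqrt(y / a) + P_K * math.sqrt(y / a)

y, P_L, P_K = 25.0, 3.0, 4.0             # illustrative values
L, K = example3_bundle(y, P_L, P_K)
print(L, K)                               # bundle satisfying L/K = P_L/P_K
print(L**2 + K**2)                        # reproduces y up to rounding
print(P_L * L + P_K * K, example3_cost(y, P_L, P_K))
```

The last line shows that the cost expression agrees with the direct outlay on the two inputs at that bundle.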
PROBLEMS 


Find (a) the expansion path and (b) supply and derived demand function 
relationships for the following production functions and input prices: 
1. y = 8L* + 20K+, Pr =] Pg = 5. 
2. y = —4K? + 2KL — 2L? + 10K + 3L, Pr=1, Pr=2 
3. Prove that the following functions are linearly homogeneous: 
(a) $y = 5x_1 + 23.5x_2$ 
(b) $y = 7x_1^2/x_2 + 3x_2^2/x_1$ 
(c) $y = 6x_1^{1/4}x_2^{3/4}$. 
4. Show that $y = 3x_1 + 4x_2 + 6$ is not linearly homogeneous. 


REFERENCES 


American Economic Association, Readings in Price Theory, in George J. Stigler 
and Kenneth E. Boulding (eds.), Richard D. Irwin, Inc., Homewood, Ill., 
1952, Articles 5-13 (especially Article 10, “Cost Curves and Supply Curves,” 
by Jacob Viner, also reprinted in R. V. Clemence, Readings in Economic 
Analysis, Addison-Wesley Publishing Co., Inc., Cambridge, Mass., 1950, 
and in Jacob Viner, The Long View and the Short, The Free Press of Glencoe, 
New York, 1958). 

Carlson, Sune, A Study on the Pure Theory of Production, P. S. King, London, 
1939. 

Cassels, John M., “On the Law of Variable Proportions,” Explorations in 
Economics, McGraw-Hill Book Company, New York, 1936, reprinted in 
American Economic Association, Readings in the Theory of Income Distribu- 
tion, The Blakiston Company, New York, 1946. 

Henderson, James M., and Richard E. Quandt, Microeconomic Theory, 2nd 
edition, McGraw-Hill Book Company, New York, 1971, Chapter 3. 

Hirshleifer, Jack, “The Firm's Cost Function: A Successful Reconstruction?” 
Journal of Business, Vol. 35, July 1962. 


Malinvaud, Edmond, Lectures on Microeconomic Theory, North-Holland Publish- 
ing Company, Amsterdam, 1972, Chapter 3 (rather difficult). 


Turvey, Ralph, “Marginal Cost,” Economic Journal, Vol. 79, June 1969. 


Linear Programming and the Theory of Production 


12 


Having summarized the standard neoclassical production analysis 
of economic theory we shall now reexamine the entire subject, this time 
making use of our linear programming equipment. It will be recalled that 
much of our description of linear programming theory in Chapter 5 em- 
ployed as its main illustration the product-line determination problem in 
which the object is to find the optimal combination of commodities to be 
produced by the firm, the quantity of each such item which should be 
turned out, and the processes by which these goods should be produced. 
It is this problem which forms the basis of the linear programming analysis 
of production. The model will therefore constitute the central focus of this 
chapter. 


1. Why a Programming Reexamination of Production Theory? 


It was stated in our previous discussion that a linear programming 
formulation of the production problem involves the implicit assumption 
that the production function is linear and homogeneous. Having gone 
into the meaning and implications of this sort of production relationship, it 
will be profitable to look once more at the linear programming theory of 
production. 

There are other, more fundamental reasons why it is desirable to look 
at the production decision problem from the programming point of view. 


1. In at least one sense, the programming analysis digs deeper than 
does the neoclassical theory. As has already been stated, the neoclassical 
theory assumes that the optimal technical production processes have some- 
how already been determined before the economic theorist gets to work on 
the problem. This premise is an integral part of the very concept of the 
production function, for, by definition, that function tells us what is the 
largest possible output which can be obtained for every input combination. 
That is, it assumes that optimal processes are employed to make those 
inputs go as far as possible. As we shall see in this chapter, the choice of an 
optimal combination of production processes, i.e., of an optimal tech- 
nological arrangement, is no trivial task. It is, however, one which can be 
handled by the methods of mathematical programming. 

2. A second reason it is desirable to reexamine production decision- 
making from a programming standpoint is that the orientation of the 
programmer and that of the businessman have a great deal in common. 
In industry one never hears of concepts such as the production function 
or the marginal product, even though it is true that these ideas must lie 
somewhere behind much of management’s thinking. Programming theory, 
though it is, of course, rather abstract and still quite removed from every- 
day managerial parlance, brings us much closer to the language and the 
viewpoint of the business world. 


However, it will turn out that the two types of approach are really not 
so different after all and that the analyst whose training is primarily in 
standard economic theory will find in the programming model a great deal 
that is familiar to him. 

It should be remarked, before proceeding, that in our previous discussion 
of the linear programming analysis of production we focused our attention 
on the choice of product line—i.e., which items should be turned out and 
in what quantities. In the present chapter, partly for variety, we concen- 
trate on the choice of process for manufacturing these items. But the analysis 
is really perfectly general and covers both problems. If we interpret the 
basic variable $Q_i$ as the quantity of commodity i to be produced, we have a 
product-line analysis. If we interpret $Q_j$ as the quantity of our (single) 
commodity to be produced with the aid of process number j, our analysis 
will tell us how much of each process to employ in our technological 
arrangements. Finally, with a slight change of notation we can use as our 
basic variables the double-subscripted symbols $Q_{ij}$ to represent the quantity 
of product i to be produced with the aid of process j. In that case, the 
same analysis determines both the optimal set of outputs and the optimal 
production process combination at one swoop. 


2. An Alternative Linear Programming Diagram 


In our geometric representation of linear programming we have so far 
employed exclusively one type of diagram. From our present point of view 
it can be characterized by the fact that its axes were used to measure 
output quantities rather than input quantities. More generally, we might 
say that it concentrated on ends rather than means. In those diagrams 
any point represents a combination of outputs (or processes) which is a 
potential solution to the problem. Inputs only made their appearance in 
the form of the constraint lines, which show us the extent to which the 
limited availability of the inputs restricts the magnitude of the outputs. 

Partly for easier comparison with the diagrams of standard production 
theory, it is now appropriate to translate the linear programming problem 
into a different diagram, one in which input quantities are measured along 
the axes and outputs are indicated with the aid of production indifference 
curves. In these new diagrams, then, the input requirements become the 
focus of the representation.¹ 


3. Illustrative Example 


Since the axes in our diagram will represent input quantities, to avail 
ourselves of the simplicity of a two-dimensional diagram we will have to 
take as our illustrative example a two-constraint (two-scarce-input) linear 
program. We will employ the following case as our illustrative example 
throughout most of this chapter: 


A leather-processing company is engaged, among other operations, in 
the dyeing of white suede leather. It is limited in its output by the capacity 
of its dyeing vats and the amount of skilled labor it has available for super- 
vision of its production process. The firm is considering four dyeing pro- 
cesses, or, rather, four variants of its basic procedure. Process 1 involves 
inspection for defects of a sample from each batch before it is put into 
the dyeing vats. Process 2 involves inspection of every individual hide. 
Process 3 also calls for examination of a sample as in process 1, but a con- 
siderably smaller proportion of the hides is inspected. Finally, process 4 
avoids the difficult preinspection process altogether. All hides are dyed 
and then quickly examined to see if the dye has “taken” satisfactorily.² 


1 The reader will readily see why the diagrams which are about to be employed are 
sometimes described by saying that they are represented in requirements space, while 
our previous linear programming diagrams are said to employ solution space. It should 
also be clear that a requirements space representation is possible for any other linear 
programming problem, such as the diet problem, the transportation problem, etc., and 
need not be associated exclusively with production analysis as it is in this chapter. The 
“solution and requirements space” terminology was invented by Charnes and Cooper. 

Let $Q_1$ be the quantity of dyed leather to be turned out by means of 
process 1, $Q_2$ be the amount produced by process 2, etc. Then 
suppose we have enough data to specify our programming problem as 
follows: 


maximize profits $= 0.9Q_1 + 0.75Q_2 + 1.0Q_3 + 1.1Q_4$ 
subject to the constraints 
$$2Q_1 + 1.5Q_2 + 3.5Q_3 + 7Q_4 \le 4{,}000$$
$$0.4Q_1 + 0.45Q_2 + 0.35Q_3 + 0.3Q_4 \le 600$$
and the nonnegativity requirements 
$$Q_1 \ge 0, \quad Q_2 \ge 0, \quad Q_3 \ge 0, \quad Q_4 \ge 0.$$


Here 4,000 (gallon-hours per week) is the available vat capacity and 600 
(man-hours per week) is the amount of skilled labor the company can use 
for the production of this dye. Our two limited inputs are thus vat capacity 
and labor. The first coefficient, 0.9 (dollars) in our profit function, repre- 
sents the return per square foot of leather treated by means of process 
number 1, and a similar interpretation holds for the other coefficients in 
the objective (profit) function. The first number, 2, in our upper constraint 
indicates the number of gallon-hours of vat capacity which will be taken 
up by a square foot of leather treated by process 1, etc. 
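
For readers who wish to experiment with this program, the following Python sketch solves it by enumerating basic solutions. It relies on the standard result that a linear program with a finite optimum attains it at a basic solution, here one with at most two positive variables because there are two constraints. The slack variables and all names in the code are introduced only for this illustration.

```python
from itertools import combinations

# Columns: Q1..Q4 plus one slack variable per constraint (unused vat capacity, unused labor).
profit = [0.9, 0.75, 1.0, 1.1, 0.0, 0.0]
vat    = [2.0, 1.5, 3.5, 7.0, 1.0, 0.0]    # gallon-hours per square foot; total 4,000
labor  = [0.4, 0.45, 0.35, 0.3, 0.0, 1.0]  # man-hours per square foot; total 600
limits = (4000.0, 600.0)

def basic_solutions():
    """Yield every nonnegative basic solution: choose 2 variables, set the rest to zero."""
    for i, j in combinations(range(6), 2):
        det = vat[i] * labor[j] - vat[j] * labor[i]
        if abs(det) < 1e-12:
            continue                        # the two columns are parallel; no unique solution
        xi = (limits[0] * labor[j] - vat[j] * limits[1]) / det   # Cramer's rule
        xj = (vat[i] * limits[1] - limits[0] * labor[i]) / det
        if xi >= -1e-9 and xj >= -1e-9:
            x = [0.0] * 6
            x[i], x[j] = xi, xj
            yield x

best = max(basic_solutions(), key=lambda x: sum(c * v for c, v in zip(profit, x)))
best_profit = sum(c * v for c, v in zip(profit, best))
print([round(v, 2) for v in best[:4]], round(best_profit, 2))
```

With the data above the search selects processes 1 and 3 (about 1,000 and 571 square feet per week), using both inputs fully, for a weekly profit of roughly $1,471. Enumeration is practical only because the problem is tiny; larger programs call for the simplex method.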


4. The Feasible Region 


We can now proceed to our new diagrammatic representation. The 
depiction of the feasible region is completely trivial. It is shown in Figure 1, 
where, by convention, inputs are shown as negative quantities below and to 
the left of the origin. Here there is constructed a rectangle OABC bounded 
by segments OA and OC of the two axes, the vertical line AB below the 
point which represents 600 labor hours and the horizontal line CB to the left 
of the point which represents 4,000 gallon-hours of vat capacity. Since a maximum 
of 600 hours of labor time and 4,000 gallon-hours of vat space are available, 
only this shaded rectangular region represents feasible input combinations. 
Thus, for example, point S outside the shaded region represents the use of 
4,000 gallon-hours of vat space and 800 labor-hours. Since that much labor time 
of the required skill is not available to the company, point S is simply not 
feasible. 


2 Clearly, it makes no real difference to the analysis that these processes are all 
described as variants of the same procedure. Process 4, for example, might equally well 
be a totally different procedure which involves the use of fancier labor-saving equipment, 
with no change in the discussion of the remainder of the chapter. 
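
The feasibility test of this section amounts to two inequalities, as the short sketch below shows. The constant and function names are our own; the capacities are those of the illustrative example.

```python
VAT_CAPACITY = 4000.0    # gallon-hours per week
LABOR_CAPACITY = 600.0   # man-hours per week

def is_feasible(vat_used, labor_used):
    """True when the input bundle lies inside rectangle OABC (both limits respected)."""
    return 0.0 <= vat_used <= VAT_CAPACITY and 0.0 <= labor_used <= LABOR_CAPACITY

print(is_feasible(4000.0, 800.0))   # point S: too much labor
print(is_feasible(2000.0, 400.0))   # a bundle inside the rectangle
```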


5. Representation of a Process 


Next we turn to the geometric representation of a production process. 
But first we must take care of a matter of definition. For our purposes, a 
production process is required to involve fixed input proportions. For ex- 
ample, in our illustrative model, process number 1 involves the use of 2 
gallon-hours of vat capacity and 0.4 hours of labor time per square foot of 
output. This means that 10 square feet of leather will require 20 gallon- 
hours and 4 labor-hours, 100 square feet involve the use of 200 vat gallon- 
hours and 40 labor-hours, etc. In other words, no matter how large the 
output produced with the aid of process 1, it will employ 2/0.4 = 5 units 
of vat time per unit of labor time. This constant ratio of input quantities 
is, in fact, a property which we use to help define a production process. 
Thus, given some two processes, A and B, if procedure A involves 6 hours 
of vat time per unit of labor time, while procedure B involves 4 hours of 
vat time per unit of labor time, A and B are by definition taken to be 
different processes.³ 

With the aid of this definition we can now readily represent a process 
diagrammatically. Since our diagram is constructed to show only inputs, a 
process must be represented in terms of its input requirements. But any 
production process requires inputs in fixed proportions, as we just stated. 
And, as was shown in Section 5 of the previous chapter, the locus of all 
points involving unchanging input proportions is always a straight line through 
the origin. In Figure 2 the line OP1 represents process 1. Specifically, we 
see that point D on this line represents the use of 400 hours of labor time 
and 2,000 vat gallon-hours, so that this point involves the correct input 
proportions for process 1: two hours of vat capacity for 0.4 hours of labor, 
and the same is true of any other point on line OP1. Moreover, a moment 
of thought indicates that point D must correspond to the production of 
1,000 square feet of leather per week (since process 1 requires 0.4 labor 
hours and 2 vat hours to make 1 square foot). Similarly, point D' represents the 
production of 2,000 square feet of leather. Thus, because it includes all 
points representing the use of 5 gallon-hours of vat time for every labor 
hour,⁴ every possible output employing process 1 is specified by some point 
on line OP1. And, conversely, every point on OP1 represents an output 
which can be produced by means of process 1 if sufficient quantities of 
resources are available. Incidentally, it should be indicated at this point 
that a "line" such as OP1 which starts at some definite point, O, but then 
goes off into space (not stopping at point P1), is properly called a ray not a 
line. This terminology is used in most of the programming literature and 
it will also be employed here. 
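
The correspondence between outputs and points on the process 1 ray can be written as a pair of small functions. This is only a sketch; the function and constant names are our own, while the coefficients are those of process 1 in the illustrative example.

```python
VAT_PER_SQFT = 2.0     # gallon-hours of vat capacity per square foot, process 1
LABOR_PER_SQFT = 0.4   # labor-hours per square foot, process 1

def inputs_for_output(square_feet):
    """Point on ray OP1 (labor-hours, vat gallon-hours) needed for the given output."""
    return LABOR_PER_SQFT * square_feet, VAT_PER_SQFT * square_feet

def output_at_point(labor_hours, vat_hours):
    """Output represented by a point on ray OP1; rejects points off the ray V = 5L."""
    if abs(vat_hours - 5.0 * labor_hours) > 1e-9:
        raise ValueError("point is not on the process 1 ray V = 5L")
    return labor_hours / LABOR_PER_SQFT

print(inputs_for_output(1000))          # inputs at point D
print(output_at_point(400.0, 2000.0))   # output at point D
```

Doubling the output doubles both inputs, so every output level lands on the same ray.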


3 The converse is, however, not part of our definition. That is, even if some procedure, 
C, also involves six hours of vat time per hour of labor time, A and C need not be taken to 
be the same process. An obvious reason is the possibility that A may be vastly more 
efficient and profitable than is C. For example, procedure A may involve the use of six 
hours of vat time and one hour of labor time per square foot of leather, while C may re- 
quire twelve hours of vat time and two hours of labor time for the same purpose. Both 
methods involve the same vat-labor time ratios, but C is clearly twice as costly as A in 
the use of these resources. In any event, it would do violence to common sense to exclude 
by definition the possibility that two different processes are by coincidence equally 
labor intensive, i.e., that they happen to use the same labor-equipment ratios. 

The real peculiarity of our standard linear programming definition of a process is the 
following: suppose that when some procedure, D, is used to produce more than a certain 
output, economies of large-scale production become possible and permit savings in the 
use of vat time. That is, suppose below outputs of 10,000 units the vat-labor time ratio 
is 6, whereas for higher outputs the ratio falls to 4. In that case, our definition forces us 
to say that this procedure really consists of two different processes. 

4 Indeed, the equation of the line is V = 5L or V/L = 5, where V and L, respectively, 
represent the quantities of vat and labor time used. Hence the slope of a process ray such 
as OP1 represents the input ratios of the process. 
In the same way as we found ray OP1, we now see that ray OP2 repre- 
sents process 2. This can be checked by observing that any point on this 
ray involves the correct input proportions for process 2: for example, point 
E on this ray represents the use of 3,000 hours of vat time and 900 hours 
of labor, thus satisfying the 1.5 hours of vat time to 0.45 hours of labor 
time requirement of process 2. 

We end this section by noting that the collection of rays representing 
such processes constitutes a (nonfinite) cone-shaped figure, P2OP3 (shaded 
region). As will be shown in the next section, interior points in the cone 
represent the concurrent use of several of these processes. In other words, 
our cone represents the total set of possible production arrangements 
involving processes 1, 2, and 3. 


PROBLEMS 


1. What output is represented in Figure 2 by 
(a) Point D”? 
(b) Point E? 
2. Show that ray OP3 represents process 3. 
3. Draw in the ray OP4 which represents process 4. 


6. Production Indifference Curves: Construction 


Moving one step closer to the diagrams of classical production theory, 
we can now proceed to construct the production indifference curves of 
our linear programming model, and in a later section we will use these to 
derive the profit indifference curves (iso-profit curves) needed for decision- 
making in the profit-maximizing firm. For the moment let us concentrate 
our attention on just processes 1 and 3. In Figure 3 we see that point D1 
represents 1,000 units of output of process 1, while D3 represents the same 
output of process 3. Thus the production indifference curve which involves 
the production of 1,000 square feet of leather must go through both these 
points. Similarly, points F1 and F3 must both lie on the 2,000-square-foot 
production indifference curve, etc. 

But what about the portion of the 2,000-output indifference curve 
which lies between points F1 and F3? It can be shown (see next footnote 
for proof) that this section of the indifference curve must be the straight 
line segment which connects points F1 and F3. 

The meaning of this statement is really not so clear as it may at first 
appear. We know that point F1 represents the inputs required for some 
level of operation of process 1 and F3 refers in a similar manner to process 3. 
But how can we define an intermediate point such as the midpoint, F, of 
line F1F3? There is no process ray going through point F and so we cannot 
explain it as a level of operation of any such process. 

Instead, we must take such interior points as F and G to represent the 
input requirements for the simultaneous use of both processes 1 and 3 in 
some combination. That is, F represents an output of 2,000 square feet of 
leather, part of which is produced by means of process 1 and part of which 
is produced by process 3. 

More specifically, it will be shown next that F, the midpoint of the line 
F1F3, represents the inputs needed for 2,000 units of output, produced half 
by one process and half by the other. Similarly, point G, which is 3/4 of the 
way from F3 toward point F1 along F1F3, represents the use of process 1 and process 
3 in the ratio 3/4 to 1/4, i.e., it involves 1,500 square feet produced by 
process 1 and 500 square feet made by process 3. The reader can readily 
extend this interpretation to other points on F1F3, always remembering 
that the nearer the point to one of the process lines, the greater the use of 
that process which it involves. 

This interpretation clearly calls for some justification. Let us look more 
closely at midpoint F. It can be seen to require 750 hours of labor time and 
5,500 gallon-hours of vat time. Now it has been stated that F represents a 
1,000-unit process 1 output (point D1) plus a 1,000-unit process 3 output 
(point D3). Let us examine the input requirements of these two separate 
outputs. These figures, which can be read off the diagram or calculated 
from our constraints, are summarized in the following table: 


Inputs Employed      Labor      Vat Capacity 

Point D1               400          2,000 
Point D3               350          3,500 
Total                  750          5,500 


It will be noted that the total input requirements of D1 and D3 together 
turn out to be exactly equal to the coordinates of point F! That is, point F 
represents precisely the total input quantities which would be required to operate 
both processes simultaneously at the 1,000-square-foot output level. And, in 
the same way, it can be shown that point G represents the inputs needed 
to produce 1,500 units of output by means of process 1 and 500 units of 
process 3 output. Since a similar proposition can be proved5 for every point 
on line segment F1F3, this justifies our interpretation of the line segment, 
and in particular, shows that F1F3 is a segment of our indifference curve. 
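
The arithmetic behind points F and G can be reproduced in a few lines. The sketch below is illustrative only: it assumes the per-unit coefficients implied by the table above (0.4 labor and 2 vat for process 1, 0.35 labor and 3.5 vat for process 3), and the helper names are hypothetical.

```python
# Inputs (labor hours, vat gallon-hours) per square foot, implied by the table above.
PROCESS_1 = (0.4, 2.0)
PROCESS_3 = (0.35, 3.5)


def combined_inputs(q1, q3):
    """Total (labor, vat) when q1 sq ft are made by process 1 and q3 by process 3."""
    labor = PROCESS_1[0] * q1 + PROCESS_3[0] * q3
    vat = PROCESS_1[1] * q1 + PROCESS_3[1] * q3
    return (labor, vat)


def point_on_segment(share_of_f1):
    """Point on the segment joining F1 and F3 whose weight on F1 is share_of_f1."""
    f1 = combined_inputs(2000, 0)   # 2,000 sq ft by process 1 alone
    f3 = combined_inputs(0, 2000)   # 2,000 sq ft by process 3 alone
    return tuple(share_of_f1 * a + (1 - share_of_f1) * b for a, b in zip(f1, f3))


point_f = combined_inputs(1000, 1000)   # half of the output by each process
point_g = combined_inputs(1500, 500)    # three-quarters of the output by process 1
print(point_f, point_on_segment(0.5))
print(point_g, point_on_segment(0.75))
```

In each printed pair the summed input requirements and the corresponding point on the segment should agree, which is the content of the proposition for these two particular mixes.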


5 Our theorem states that if A and B represent equal outputs produced by different 
processes, then any point, C, on line segment AB represents the inputs needed to make 
the same quantity of the commodity by means of a combination of the two processes. 
The converse, that any such combination is represented by a point on AB, will also be 
shown to be valid. 

Proof: In Figure 4 let OA represent an R-unit operation of process a and OB represent 
an R-unit operation of process b. Let C be any point on line AB. Construct line DC 


[Figure 4. Axis label: input x] 


parallel to OB and line EC parallel to OA. Then OE/OB = AC/AB = some number, k. 
Thus point E represents the fraction, k, of the output at B, i.e., the output level at E is 
kR, and it is produced by process b. 

Similarly, OD/OA = CB/AB = (AB - AC)/AB = 1 - k. Thus, point D represents 
an output of magnitude (1 - k)R, produced by process a. Therefore, together, points D 
and E represent an output of kR + (1 - k)R = R units of production. 

Moreover, the two processes thus combined use OG + OF of input x. But OECD is a 
parallelogram, so that right triangle OFD is congruent with triangle EJC (since OD = 
EC and angle DOF = angle CEJ). Thus OF = EJ = GH. Hence the quantity of x used is 


OG + OF = OG + GH = OH. 


Similarly, points E and D together involve the employment of OM units of y. Thus, point 
C represents these total input quantities, OH of x and OM of y, as was to be shown. 

Proof of converse: The converse will be demonstrated by means of an algebraic 
argument to illustrate an alternative approach. Let Xa and Ya be the quantities of x 
and y used at point A, and let Xb and Yb be the corresponding amounts for point B. 
Finally, let the equation of line AB be Y = αX + β. Then since points A and B both 
lie on this line, we have 


(1) Ya = αXa + β and Yb = αXb + β. 


Now consider any combination of the two processes in which process a produces kR 
units and process b produces (1 - k)R units so that, together, they manufacture R 
units, as required. Then process a uses kXa units of x and kYa units of y, and process 
Having established the basic point, that line segment F1F3 is part of 
the production indifference curve connecting points F1 and F3, we can now 
construct the rest of the indifference curve as well as other indifference 
curves without any difficulty. In Figure 5 we now have included the rays 
which represent three of our processes: 1, 2, and 3. Point F2 on ray OP2 
represents the output of 2,000 square feet of leather by means of process 2. 
Therefore line F2F1 is also part of the 2,000-unit production indifference 
curve. In a similar way it can be shown that broken line E2E1E3 is (a 
portion of) the 1,000-square-foot output production indifference curve. 
Other indifference curves in this map can be constructed similarly. 


7. Some Properties of the Indifference Curves 


One may well ask why we do not draw the straight line F2F3 (broken 
line) representing the 2,000-unit output combinations of processes 2 and 
3 and consider points on this line to constitute a relevant portion of the 
production indifference curve (or area). The answer is that any point on 


b uses (1 - k)Xb units of x and (1 - k)Yb units of y. Thus together they employ kXa + 
(1 - k)Xb units of x and kYa + (1 - k)Yb units of input y. To see whether this input 
combination is represented by a point on line AB we must test whether these values of 
X and Y satisfy our equation Y = αX + β. Substituting our value of X into the 
equation we obtain 

Y = α[kXa + (1 - k)Xb] + β = k(αXa + β) + (1 - k)(αXb + β) 

which by Equation (1), above, equals kYa + (1 - k)Yb, so that we have 

kYa + (1 - k)Yb = α[kXa + (1 - k)Xb] + β 

and so our values of X and Y [i.e., X = kXa + (1 - k)Xb and Y = kYa + (1 - k)Yb] 
do indeed satisfy the equation for line AB. Thus, the point representing any such input 
combination does lie on line AB, as was to be shown. (The reader should satisfy himself 
that it must lie on the portion of the line Y = αX + β which lies between points A 
and B.) 
F2F3 (which does indeed represent a 2,000-unit output combination of 
processes 2 and 3) is necessarily a wasteful arrangement, and it will never 
occur in an optimal solution. For consider any point, G, on F2F3. Corre- 
sponding to any such point there will be points on F2F1F3, such as point 
H, which lie above and to the right of G. This means that point H uses less 
of both inputs than does G. But H and G both yield the same outputs. 
Therefore G clearly represents an inefficient use of resources, is irrelevant 
for an optimal solution, and can therefore be ignored as far as the 2,000- 
unit production indifference curve F2F1F3 is concerned. 
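
The waste involved in a point on F2F3 can be illustrated with one numerical case. This is a sketch only: input quantities are treated here as positive magnitudes (labor, vat), the coordinates of F1, F2, and F3 are twice the 1,000-unit requirements given in the text, and the particular mixing fractions 0.5 and 0.2 are arbitrary choices made for the illustration.

```python
F1 = (800.0, 4000.0)   # 2,000 sq ft by process 1: (labor, vat)
F2 = (900.0, 3000.0)   # 2,000 sq ft by process 2
F3 = (700.0, 7000.0)   # 2,000 sq ft by process 3


def mix(p, q, t):
    """Point a fraction t of the way from p to q along the straight segment."""
    return tuple((1 - t) * a + t * b for a, b in zip(p, q))


def uses_less_of_both(h, g):
    """True when h needs strictly less labor and strictly less vat time than g."""
    return h[0] < g[0] and h[1] < g[1]


g = mix(F2, F3, 0.5)   # a 2,000-unit combination of processes 2 and 3
h = mix(F1, F3, 0.2)   # a 2,000-unit combination of processes 1 and 3
print(g, h, uses_less_of_both(h, g))
```

Both points yield 2,000 square feet, yet the mix of processes 1 and 3 needs less labor and less vat time, so the point on F2F3 can be discarded.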

We may now observe the characteristic shape of the production in- 
difference curves of linear programming. They consist of kinked line 
segments. Their slope is always negative (or at least nonpositive). Their 
relevant portions are convex to the origin, i.e., they necessarily involve a 
diminishing (or at least a nonincreasing) marginal rate of substitution. 

Thus the production indifference curves of linear programming have 
the same basic shape as do the corresponding curves of neoclassical production 
theory except for the fact that the latter are usually taken to be 
smooth throughout, i.e., they are assumed to contain no kinks or corners. 
This premise is usually employed in the neoclassical theory to make it 


6 For the definition of the terms see the preceding three chapters. To justify our 
assertion, we need merely note what would occur in an increasing marginal rate of substitution 
case as depicted by curve W2W1W3 in Figure 6. In that case, for reasons which have just 
been given, the straight-line segment W2KW3, representing a combination of processes 
P2 and P3, is more efficient than any point on W2W1W3. Thus, that line would be 
irrelevant in an optimal solution and W2KW3, not W2W1W3, would be the pertinent 
production indifference curve. 

[Figure 6] 

The nonpositive slope follows from a similar argument. If (as in the case of SS’ in 
Figure 10b) there were a positively sloping segment whose highest point is S’, then 
every other point on SS’ would involve larger quantities of both inputs for the same 
output. Therefore, the entire segment (except endpoint S') could be ignored on grounds 
of inefficiency. Strictly speaking, since the firm is taken to maximize profits rather than 
output, the preceding arguments are not really applicable to the production indifference 
curves that are under discussion here. However, they do hold directly for the profit 
indifference curves described in Section 8. 
easier to apply the differential calculus, which breaks down at a kink 
(corner) point, such as F1 in Figure 5, since the slope of the curve at that 
point is not defined. 

Linear programming is, then, compatible with the type of diminishing- 
returns phenomenon represented by a diminishing marginal rate of sub- 
stitution of one input for another (see the discussion of this term in the 
preceding three chapters). That is, if labor can be used to save vat time 
(e.g., by culling out defective pieces of leather which would otherwise 
waste vat space), increased use of labor for this purpose (with output 
remaining unchanged) may yield diminishing returns, i.e., diminishing 
marginal saving of vat time. 

Presently we will see that the ordinary “law” of diminishing returns is 
also compatible with linear programming, i.e., the marginal yield to in- 
creased use of one input may decline, provided the employment of all other 
inputs remains unchanged. However, linearity does rule out diminishing 
returns to scale. That is, as already stated, it implies that the production 
function is linearly homogeneous. The diagram behaves accordingly. Recall 
that such a production function is characterized by indifference curves 
which are parallel in the sense that they all have the same slope along any 
straight line from the origin (cf. Section 10 of the previous chapter). But 
that is precisely what occurs here. In Figure 5 it is readily verified that the 
slopes of F1F3 and of E1E3 are equal. For we have (cancelling out all 
minus signs of the negative input values) 


slope of F1F3 = (7,000 - 4,000)/(700 - 800) = -30 

and 

slope of E1E3 = (3,500 - 2,000)/(350 - 400) = -30. 


Similarly, F2F1 and E2E1 have the same slopes. This illustrates the fact 
that the production indifference curves have the parallelism property of a 
linearly homogeneous production function.7 The reader should observe that 


7 This property follows from the constancy of the coefficients in the constraints, 
which imply that if we double our output, sticking to the use of process 1 or any other 
specific process or fixed combination of processes, it will require a doubled use of our 
scarce inputs. That remark by itself shows why the production function in such a case is 
linearly homogeneous. 

The parallelism phenomenon follows from the fact that F1 involves twice the inputs 
employed at E1 (where, say, X1 and Y1 of labor and vat time are used) and F3 involves 
twice the inputs (X3 and Y3) which are used at E3. Thus the slope of E1E3 is (Y3 - Y1)/ 
(X3 - X1), while the slope of F1F3 is (2Y3 - 2Y1)/(2X3 - 2X1), which (cancelling out 
the 2's) is clearly equal to the slope of E1E3. 

More generally, we can show that the corresponding indifference curve line segment 
which involves k thousand units of output also has the same slope as E1E3. For this 
purpose we employ exactly the same method of proof as was just used for F1F3 except 
that the symbol k is now substituted for the number 2 throughout the argument. 
this property incidentally guarantees that the linear programming 
indifference curves will never intersect. 
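
The parallelism property lends itself to a quick numerical check. The following is a sketch under stated assumptions: the points E1, E2, and E3 are the 1,000-unit input requirements given in the text, written as positive (labor, vat) magnitudes, and the function names and the extra scale factor 3.5 are illustrative choices.

```python
def slope(p, q):
    """Slope of the segment from p to q, with labor on the horizontal axis."""
    return (q[1] - p[1]) / (q[0] - p[0])


def scale(point, k):
    """Inputs needed when the output represented by `point` is multiplied by k."""
    return (k * point[0], k * point[1])


# (labor hours, vat gallon-hours) for 1,000 sq ft by processes 1, 2, and 3.
E1, E2, E3 = (400, 2000), (450, 1500), (350, 3500)

# Corresponding segments at 1, 2, and 3.5 thousand units of output.
slopes_1_to_3 = [slope(scale(E1, k), scale(E3, k)) for k in (1, 2, 3.5)]
slopes_2_to_1 = [slope(scale(E2, k), scale(E1, k)) for k in (1, 2, 3.5)]
print(slopes_1_to_3)
print(slopes_2_to_1)
```

Within each list the slopes should be identical, since multiplying both endpoints by k multiplies numerator and denominator alike.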


PROBLEMS 


1. Show that point G, which is the midpoint between F and F1 in Figure 3, involves 
the inputs required to produce 1,500 units of output by means of process 1 and 
500 units of output by means of process 3. 

2. Complete the two indifference curves in Figure 5 by taking into account process 
4 and the corresponding points E4 and F4. 

3. Show that it will never be efficient for our firm to use a combination of processes 
2 and 4 to produce 2,000 units of output. (Hint: Draw in straight-line segment 
F2F4 in the diagram which constitutes the answer to Problem 2 and compare 
it with indifference curve F2F1F3F4.) 

4. Show from the parallelism feature of the indifference curve that your preceding 
answer is valid for any output level, i.e., that it is never efficient to use processes 
2 and 4 together (so long as processes 1 and 3 constitute available alternatives). 


5. Show numerically that F1F2 is parallel to E1E2 in Figure 5. 


8. Profit Indifference Curves 


The production indifference curves in Figure 5 can readily be translated 
into profit indifference curves. 

It will be recalled that the four processes are not equally profitable. 
In fact, from our objective function, 


profit = 0.9Q1 + 0.75Q2 + 1.00Q3 + 1.1Q4, 


we note that the unit profits of outputs 1, 2, 3, and 4 are, respectively, 
90 cents, 75 cents, $1, and $1.10. 
In Figure 7 we reproduce production indifference curve E2E1E3 from 
Figure 5. Each of points E1, E2, and E3 represents an output of 1,000 square 
feet of leather. Let us see what points represent outputs which yield $1,000 
in profit. 

Since process 3 yields $1 in profit for every square foot of output, point 
E3 represents both 1,000 units of product and 1,000 units of profit, i.e., at 
that point the $1,000 profit and 1,000-unit production indifference curves 
coincide. 

But point E1 involves considerably less than $1,000 in profit. Specifically, 
since every unit of process 1 output pays 90 cents, point E1 represents 
only $900 in earnings. Thus, to earn $1,000 by means of process 1 we 
have to manufacture 11.1 per cent more, i.e., 1,111 units must be produced 
(since 1,111 x 0.90 = 1,000 approx.). Hence the point W1 on ray OP1, 
which lies on the $1,000 profit indifference curve, is 1/9 farther from the 
origin than is E1. Finally, to find the point of coincidence between this 
profit indifference curve and ray OP2, we note that each unit of process 2 
output yields only 75 cents, so that it requires 1,333 1/3 units of process 2 
output to yield $1,000 in profit. This is represented by point W2, where 
length OW2 is exactly 1 1/3 times as great as length OE2. [Once again, 
this is so because of the linearity of our program, which implies that it 
takes 1/3 more of both inputs to produce 1,333.33 units of output via process 
2 (600 hours of labor and 2,000 vat gallons) than is required to produce 
1,000 units of output by means of this process (450 hours of labor and 
1,500 vat gallons; see the coefficients of Q2 in our constraints).] 

For the same reasons as in the production indifference curves, we can 
again connect points W2, W1, and E3 by straight-line segments and the resulting 
graph will constitute the relevant portion of the $1,000 profit indifference 
curve. Other profit indifference curves can readily be obtained in the same 
way. 
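
The location of the points on a given profit indifference curve follows from simple division. The sketch below assumes the per-unit input requirements and unit profits of processes 1, 2, and 3 as quoted in the text; process 4 is left out because its input coefficients are not stated in this part of the chapter, and the function name is hypothetical.

```python
# Per square foot: labor hours, vat gallon-hours, and profit in dollars.
PROCESSES = {
    1: {"labor": 0.40, "vat": 2.0, "profit": 0.90},
    2: {"labor": 0.45, "vat": 1.5, "profit": 0.75},
    3: {"labor": 0.35, "vat": 3.5, "profit": 1.00},
}


def point_for_profit(process, target_profit):
    """Return (output, labor, vat) at which one process alone earns target_profit."""
    p = PROCESSES[process]
    output = target_profit / p["profit"]
    return (output, p["labor"] * output, p["vat"] * output)


w1 = point_for_profit(1, 1000)   # point W1 on ray OP1
w2 = point_for_profit(2, 1000)   # point W2 on ray OP2
e3 = point_for_profit(3, 1000)   # point E3 on ray OP3
for name, point in (("W1", w1), ("W2", w2), ("E3", e3)):
    print(name, [round(value, 2) for value in point])
```

The less profitable the process, the larger the output (and hence the inputs) needed to reach the same profit, which is why W2 lies proportionately farther out than W1.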

As in the case of the iso-product curves, the profit indifference curve 
will have the parallelism feature characteristic of a linear homogeneous 
production function. A linear program yields constant profit returns to 
scale, e.g., a tripling of all the operations of our firm will triple its profits. 
In other words, we can also obtain additional profit indifference curves in 
Figure 7 directly by just drawing any other "curve" V2V1V3 whose 
segments are equal in slope to the corresponding segments of indifference 
curve W2W1E3. 

Observe, finally, that the less profitable a process happens to be, the 
further out will we shift a point on that ray to transfer it from a given 
production indifference curve to a specific profit indifference curve. Thus, 
the move from E2 to W2 is (proportionately) greater than the move from 
E1 to W1 in Figure 7 because process 1 yields 90 cents profit per unit while 
process 2 offers only 75 cents per unit. Suppose, for a moment, that process 
1 were very unprofitable. Then curve W2W1E3 might even have become 
concave to the origin as is curve W2W1W3 in Figure 6. In that case, the 
straight-line segment W2KW3 would be closer to the origin than W2W1W3, 
which means that process 1 is then a totally inefficient profit earner—it is 
simply not worth considering in comparison with a combination of processes 
2 and 3, which can yield the same profits with the use of much smaller 
quantities of input. In exactly the same way, it may happen that a profit 
indifference curve acquires a positively sloping segment (see Figure 10b, 
lines WW’ or SS’). This segment can then be ignored because the process 
at the upper end of the segment (process P in Figure 10b) must be rela- 
tively unprofitable—it takes larger quantities of both inputs to produce 
the same profits than does the other process, P’. 


9. Graphic Solution of the Programming Problem 


We have now obtained a graphic description of both the profit possibili- 
ties (Figure 7) and the feasible region as delineated by the availability of 
labor and vat inputs (Figure 1). It is now a simple matter to combine the 
two diagrams by superimposing them one upon the other, and then to 
find the optimal solution of the programming problem. The two diagrams 
are combined in this way in Figure 8. 

Figure 8 

It will be recalled that only points within the shaded rectangle involve 
input quantities no greater than the amounts available to the company. 
The object of our calculation is to determine how to earn the largest amount 
of profit which can be extracted from the available resources. Thus we want 
to get to the lowest possible profit indifference curve which has any point 
in common with the feasible region. 

Since between any two process rays (the OP’s) the segments of the 
indifference curves have the same slopes, we can, given one of these curves, 
construct as many other indifference curves as we like. In particular, we 
can construct the curve S2S1S3, which just goes through the lower left- 
hand corner point, B, of the feasible rectangle. This is the linear program- 
ming analogue of the optimal tangency point of classical production 
theory. Point B, then, represents our optimal solution. 

By examining point B we determine the following: 


1. Since B lies on line segment S1S3, it involves the use of a combina- 
tion of processes 1 and 3. This illustrates the basic theorem of linear 
programming, namely that the solution will usually contain as many nonzero 
elements as there are constraints in the problem. Where, as in our original 
linear program, two constraints are involved, there will usually be no more 
than two production processes employed in an optimal arrangement.8 

2. Our optimal output involves the use of exactly 600 hours of labor 
and 4,000 vat gallon-hours—that is, in this case it involves full use of both 
limited resources of the firm. 


8 If there had been three constraints (three inputs), our diagram would have been 
three-dimensional with three axes to represent the magnitudes of the three inputs 
(Figure 9). The lightly shaded quasi-cubical region (the rectangular prism) is the feasible 
region. The production process loci, OP, OP', and OP", are rays in three-dimensional 
space, and together they form a cone with flat sides, OPP'P". Corresponding to line 
segments such as W1W3, S1S3, and V1V3 in Figure 8, we now have the heavily shaded 
triangles WW'W", SS'S", and VV'V" in Figure 9. These triangles now represent 
combinations of the three processes P, P', and P". Thus, in this three-constraint case the 
optimal point, B, involves the use of three processes at once. Here, unlike the two- 
dimensional case, it may pay to use three processes at once, because, for example, line 
segment SS" does not lie between the origin and either segment SS' or S'S". Hence 
SS" is not in this sense more efficient than the other segments, and, similarly, neither 
of the other segments is more efficient than SS". Hence none of these can be ruled out 
in advance, and so any point of these segments, as well as any combination specified 
by an interior point of triangle SS'S", represents a legitimate portion of the production 
indifference surface. 

[Figure 9. Axis label: quantity of input Z] 
3. Since S2S1S3 is a little nearer to W2W1W3 (the $1,000 curve) than 
it is to V2V1V3 (the $2,000 curve), it yields somewhat less than $1,500 in 
profit. More specifically, we note that point S1 on this same indifference 
curve involves the exclusive use of process 1 and employs about 3,200 
units of vat time (and about 640 labor hours). It must produce an output 
of approximately 1,600 square feet (since process 1 employs 2 units of vat 
capacity per square foot of leather, 3,200 vat gallons will suffice to 
produce 1,600 square feet). At 90 cents per foot this means a profit of 
about $1,440. In fact, a standard simplex calculation of the optimal solu- 
tion of our programming problem shows that the total profit will be 
$1,471.43. 

4. Since point B lies approximately 3/5 of the way toward point S1 on 
line S1S3, the optimal solution involves approximately 3/5($1,440) = 
$864 of profit on process 1 production and 2/5($1,440) = $576 of process 3 
profits.9 The precise profit figures yielded by a simplex calculation are 
$900 on process 1 output and $571.43 on process 3 output. 
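
The precise figures quoted above can be recovered without a full simplex routine. The sketch below is not the simplex method: it merely solves, for each pair of processes, the two equations that exhaust both inputs (600 labor hours and 4,000 vat gallon-hours) and keeps the most profitable feasible pair. It assumes the coefficients of processes 1, 2, and 3 as quoted in the text; process 4 is omitted because its input coefficients are not given in this part of the chapter, and solutions that leave an input partly unused are not examined.

```python
from itertools import combinations

LABOR_AVAILABLE = 600.0   # labor hours
VAT_AVAILABLE = 4000.0    # vat gallon-hours

# process: (labor per sq ft, vat per sq ft, profit per sq ft)
PROCESSES = {
    1: (0.40, 2.0, 0.90),
    2: (0.45, 1.5, 0.75),
    3: (0.35, 3.5, 1.00),
}


def solve_pair(a, b):
    """Outputs (qa, qb) that use up both inputs with processes a and b only."""
    la, va, _ = PROCESSES[a]
    lb, vb, _ = PROCESSES[b]
    det = la * vb - lb * va   # nonzero because the two process rays differ
    qa = (LABOR_AVAILABLE * vb - lb * VAT_AVAILABLE) / det
    qb = (la * VAT_AVAILABLE - va * LABOR_AVAILABLE) / det
    return qa, qb


results = {}
for a, b in combinations(PROCESSES, 2):
    qa, qb = solve_pair(a, b)
    if qa >= 0 and qb >= 0:   # a negative output level is not feasible
        profit = PROCESSES[a][2] * qa + PROCESSES[b][2] * qb
        results[(a, b)] = (qa, qb, profit)

best_pair = max(results, key=lambda pair: results[pair][2])
print(best_pair, [round(v, 2) for v in results[best_pair]])
```

Under these assumptions the pair of processes 1 and 3 should come out ahead, in agreement with the graphic solution at point B.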


PROBLEM 


Suppose process 3 were forbidden by law and the company had only processes 
1, 2, and 4 to choose among. Find the optimal solution by graphic methods. Check 
your answer by means of the simplex computation. 


10. Alternative Types of Solutions 


Figure 10 illustrates several other varieties of solution, some of them 
“pathological,” which sometimes occur in linear programming problems. 

Figure 10a represents a rather common situation. Here point B, the 
lower left-hand corner of the feasible region, lies outside the heavily 
shaded cone of production possibilities, POP". In that case, the optimal 
point is S and not B. The firm's resources will then not be used fully. 
Specifically, there will be an unused amount of X whose magnitude is 
indicated by length SB. Moreover, in this situation just one process, P, 
will be employed exclusively. Thus one process variable, Q, and one slack 
variable (the unused output of X) will not be equal to zero, again giving 
us two nonzero variables in this two-constraint case, as the basic theorem 
of linear programming requires. 

A somewhat similar situation is depicted in Figure 10b where, even 
though point B lies inside the production possibility cone, process P is 


9 Since the unit profits on processes 1 and 3 are 90 cents and $1.00, respectively, this 
implies physical outputs of 864/0.9 = 960 units through process 1 and 576/1.0 = 576 
units via process 3. The simplex calculation yields the output values 1,000 and 571.43 
units for processes 1 and 3, respectively. 
Figure 10 


highly unprofitable, as is indicated by the positive slope of the profit in- 
difference curves SS’ and WW’. That is, point S yields the same profit 
return as does point S', but since S is below and to the left of S’, it re- 
quires more of both inputs than does S' to obtain these same profits. Hence 
it will pay to use only process P', the more profitable process, and to use 
it to the full extent permitted by the company’s resources. Thus the optimal 
point, the point on the lowest indifference curve in the shaded feasible 
region, is S’. Like S in Figure 10a, our optimal point S' is a basic solution 
involving one nonzero level of process operation and one nonzero slack 
variable (unused X). 

Figures 10c and 10d represent rather more freakish cases. In Figure 
10c two segments such as VV' and V'V" of the indifference curves happen 
to form one straight-line segment. Since SS' and S'S" together coincide 
with SS", there is no disadvantage to using a combination of all three 
processes, i.e., an optimal solution can be found in which we have simul- 
taneously Q > 0, Q' > 0, and Q" > 0. However, there is really nothing 
to be gained by the simultaneous use of all three processes since any three- 
process solution corresponds to an equivalent two-process solution. For 
example, point B falls both on SS" and on S'S", so we can do as well by 
employing just processes P' and P" as we would by using all three methods. 

Figure 10d is another odd case in which one of the process rays, OP’, 
happens to go through point B, the lower left-hand corner of the feasible 
region. Here it pays to use only the one process, P'. The basic theorem of 
linear programming is violated in this case since we have Q = 0, Q' > 0, 
Q" = 0, and, since X and Y are both used to capacity, both slack 
variables also take the value zero. Thus, despite the fact that there are two 
constraints in our problem, its solution involves only one nonzero-valued 
variable. 

Figure 10d represents the phenomenon which is called degeneracy. In- 
tuitively it means that one process happens by accident to employ resources 
in the right proportion to use up the available resources completely (or, 
in the general case, that this is done by a number of processes smaller 
than the number of different input resources). Computational experience 
indicates that such cases are encountered more frequently than might be 
expected in advance. Degeneracy causes some computational difficulties 
but they are not usually very serious. 


PROBLEM 


Show that points O, S', B, and T are the basic solution points in Figure 10b, 
i.e., they are the points which involve exactly as many nonzero-valued variables as 
there are constraints in the problem (two). (Cf. Chapter 5 for the definition of the 
term "basic solution.") 


11. Marginal, Total, and Average Input Products 


As a final extension of our conventionalization of the linear program- 
ming analysis, let us examine the marginal revenue10 productivity of our 
two inputs. Simply for the sake of variety let us turn at this point to another 
programming problem: 


Max 4Q1 + 4Q2 

subject to 

2Q1 + Q2 ≤ x 
2Q1 + 3Q2 ≤ y 
Q1 ≥ 0, Q2 ≥ 0, 


where x and y represent the available quantities of the two inputs. 


10 The use of profit indifference curves in our calculation is the reason it gives 
marginal revenue products. We obtain a measure of the marginal revenue of an input 
rather than its marginal profit yield because the cost of the input is 
not taken into account in the calculation. 
Our profit indifference map is shown in Figure 11a. The horizontal 
segments such as CD have been added to the indifference curves simply 
to show that beyond a quantity required by either of the two processes 
(see, for example, point C), a further addition to the firm's stock of input 
x by itself (leaving the available quantity of y unchanged) will add nothing 
to output, i.e., it will leave us on the same indifference curve. The reason 
for the vertical segments (e.g., BA) is perfectly analogous. 

Next to each indifference curve a number has also been inserted to 
indicate the profit level it represents. Thus the 8 at the end of the lower 
indifference curve indicates that any point on this curve represents an 
arrangement which will yield a profit of 8 (thousand dollars?). 

Let us now investigate the marginal revenue productivity of input x. 
For this purpose we must keep the quantity of input y fixed and see what 
happens to total product as additional units of x are made available to the 
company. Fixing the quantity of y arbitrarily at y = 4, and adding suc- 
cessive units to the firm's stock of input x, we obtain, in turn, points E, 
F, G, etc. 

Now at point E we are on the 4 (thousand dollar) profit indifference 
curve, ABCD. Thus, with one unit of x we obtain a total profit of $4,000. 
This is recorded as point e on the total revenue product curve in Figure 
11b. Similarly, point F in Figure 11a indicates that two units of x permit 
the acquisition of six units of profit, and this gives us point f in Figure 11b. 
In this way the entire total revenue product curve can be determined. 
Note that this curve rises steadily up until point s in Figure 11b. To see 
why this is so, note that this segment of the total revenue product curve 
corresponds to the points on line segment RS (along which y = 4) in 
Figure 11a. There indifference curves are crossed at a constant rate as 
input x increases. But once we move to the right of point S in 11a (we 
cross ray OP), we leave the vertical segments of the indifference curves, 
and it now takes a larger increase in x to yield a given rise in total revenue. 
Finally, to the right of point H no expansion of the use of x can increase 
revenue any further, and so the corresponding segment of the total revenue 
product curve (the portion to the right of point h) becomes horizontal. 
The reader is left to examine for himself the construction of the average 
and marginal revenue product curves (Figures 11c and 11d). He should 
notice that the discontinuities in the marginal revenue product curve occur 
precisely at the input levels where one finds the kinks, s and h, in the total 
revenue curve. This is so, of course, because marginal revenue product at 
any input level is measured by the slope of the total revenue curve (cf. 
Chapter 3, Section 4). 

In our diagrams we observe that the productivity curves of linear 
programming do exhibit diminishing returns and, in particular, diminishing 
marginal products. But the decreases characteristically occur in dis- 
continuous jumps. 
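
The construction of the total revenue product curve can also be checked numerically. The following is a minimal sketch in Python, not part of the original exposition. It assumes the program reads Max 4Q1 + 4Q2 subject to 2Q1 + Q2 ≤ x, 2Q1 + 3Q2 ≤ y, Q1, Q2 ≥ 0, and the helper name `max_profit` is hypothetical. Because the problem has only two variables, the sketch simply inspects every corner of the feasible region rather than running the simplex method.

```python
from itertools import combinations


def max_profit(x, y):
    """Largest value of 4*Q1 + 4*Q2 given input stocks x and y (both >= 0)."""
    # Each constraint is (a1, a2, b), read as a1*Q1 + a2*Q2 <= b.
    constraints = [
        (2.0, 1.0, x),     # input x: 2*Q1 +   Q2 <= x
        (2.0, 3.0, y),     # input y: 2*Q1 + 3*Q2 <= y
        (-1.0, 0.0, 0.0),  # Q1 >= 0
        (0.0, -1.0, 0.0),  # Q2 >= 0
    ]
    best = 0.0  # producing nothing is always feasible
    # A linear program attains its optimum at a corner, so intersect the
    # constraint boundaries pairwise and keep the best feasible corner.
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < 1e-12:
            continue  # parallel boundaries never meet
        q1 = (b * c2 - a2 * d) / det
        q2 = (a1 * d - c1 * b) / det
        feasible = all(k1 * q1 + k2 * q2 <= k + 1e-9 for k1, k2, k in constraints)
        if feasible:
            best = max(best, 4.0 * q1 + 4.0 * q2)
    return best


# Total revenue product of x with y held at 4, as in the text.
for units in range(0, 7):
    total = max_profit(float(units), 4.0)
    print(units, round(total, 4))
```

With y = 4 the sketch reproduces the profits of 4 and 6 reported for points e and f, and the total stops rising once x reaches 4, which corresponds to the horizontal portion of the curve.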
12. Conclusion 


We have seen in this chapter that the points of view of marginal analysis 
and linear programming are not so different after all. The programming 
view of production is restrictive only in that it assumes that the production 
function is linear and homogeneous and that it deals with changes that 
are abrupt and discontinuous, so that we do not have the smooth indiffer- 
ence and marginal productivity curves of classical production analysis. In 
at least one respect the programming approach probes more deeply than 
the other because it enables us to see what lies behind the production func- 
tion in terms of the optimal choice of process combinations for any set of 
input or output levels. 
Comparative Statics 
and Optimization: 
Consumers and Firms 




This chapter undertakes a systematic introduction to a set of 
analytic tools, the calculus methods of comparative statics. These power- 
ful methods have yielded valuable results in a wide variety of subject 
areas including welfare economics, the theory of taxation, stabilization 
policy, and micro- and macroeconomics in general. They have proved 
particularly effective in providing qualitative conclusions indicating the 
direction (i.e., the sign) of the effects of a given change in policy or in 
underlying economic circumstances. The results have also been useful in 
quantitative analysis, that is, in the evaluation of the magnitudes of such 
effects. It is no exaggeration to say that some of the most widely noted 
theorems in economics are the products of comparative-statics analysis. 


1. Comparative Statics: Parameters and Endogenous Variables 


Consider a model used to describe the firm’s production decisions, i.e., 
the quantities of its various outputs and inputs given the prices of all these 
items. In such a case, the input and output quantities can be referred to as 
the endogenous variables, that is, they are the variables whose values are 
determined within the system. On the other hand, the price of an input or 
an output, say the hourly wage rate, is in this case described as a parameter, 
that is, it is a magnitude which may be changed by outside forces but which, 
from the point of view of the behavior of a competitive firm, must be taken 
as fixed. 

It should immediately be clear that one person’s parameter is another’s 
endogenous variable. To the consumer or the firm, the level of a tax on 
cigarettes is a parameter. To the legislator who may be interpreted to be 
seeking to determine a socially optimal value of that tax rate, the rate is 
an endogenous variable of his optimization calculation. Thus it is the nature 
of the problem being studied, as encompassed in the structure of the 
appropriate model, that determines whether some entity is to be considered 
a variable or a parameter for the purposes at hand. 

Parameters may at first be confused with variables because, ordinarily, 
neither of them is given numerically. The symbol $p_j$, representing the 
price of commodity j, may seem as much a variable as $y_j$, denoting the 
output of that commodity. But for a study of competitive firms $p_j$ is to 
be interpreted as a constant, albeit one whose magnitude may not be 
known. Moreover, that constant value may conceivably be replaced by 
another constant value in response to change in outside forces. A fiscal 
crisis can force a rise in the sales tax from 5 to 7 per cent. But from the 
point of view of business decisions the latter figure is as much a given as 
was the former. No change in the firm’s output level, in advertising ex- 
penditure, or in the value of any other of its endogenous (decision) 
variables will affect that tax rate once it is determined by the legislature. 

In seeking to select an optimal tax rate the legislature must consider 
the range of reasonable alternatives and must estimate or guess at their 
effects. For example, it may guess at the tax revenues that will accrue if 
the rate is set at 4 per cent or if, instead, the selected figure is 5 or 6 or 7 per 
cent. But that tax revenue figure will depend on the reaction of consumers 
and firms to the magnitude selected. If demands and production levels 
(the endogenous variables of the consumer's or firm's decision problems) 
are very little different under a high than under a low tax rate, a 7 per cent 
sales tax will bring in far more money than a 5 per cent tax, and the reverse 
will clearly be true if demands and outputs are highly responsive to that 
choice. 

It is therefore important to know how the behavior of endogenous 
variables will differ with different parameter values. This is essentially 
what is meant by a problem in comparative statics. More formally, we 
have the definition 


Comparative statics is the comparison of the equilibrium values of the 
endogenous variables of an economic model corresponding to alternative 
values of the parameters selected for study. 


Two features of this definition merit emphasis: 


1. The parameter values investigated are always taken as alternatives, 
not as sequential changes. That is, the issue is what will happen if a 5 per 
cent tax rate is chosen or if a 7 per cent rate is chosen instead; comparative 
statics does not examine what happens if a 5 per cent tax rate which prevails 
until next January is thereafter replaced by a 7 per cent tax rate. Behavior 
over time never enters a calculation in comparative statics. 

2. Comparative statics concerns itself only with equilibrium values of 
the endogenous variables, i.e., it concerns itself only with the system after 
it has adjusted fully to the selected values of the parameters. This is just 
another side of the static character of the analysis. 


2. Comparative Statics Without Optimization 


Optimization is often an essential ingredient of a comparative-statics 
analysis, as we will see; but sometimes it is entirely absent. A simple 
example is the elementary analysis of the effect of an excise tax on the price 
and output of a competitive industry. Using the Marshallian supply- 
demand diagram (Figure 1), we recall that if SS' is the supply curve in 
the absence of a tax, then with a tax rate equal to ST = AC dollars per 
unit of output the supply curve will instead be TT', which lies uniformly 
above SS' by the amount of the unit tax. We see at once that $y_t$, the equi- 
librium output under the tax, will be less than $y_s$, the equilibrium without 
a tax, a result that is hardly surprising. Moreover, price will be higher 
under the tax ($p_t - p_s = BC$), but the price difference may well be less than 
the tax ($BC < AC$). In other words, pure competition and the slopes of the 
supply and demand curves may force suppliers to absorb part of the tax. 


Figure 1 
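
The comparison of the two equilibria can be illustrated with numbers. The sketch below is a hypothetical example in Python: the linear demand and supply curves, their coefficients, and the function name `equilibrium` are all assumptions chosen for illustration and do not come from Figure 1. It computes the equilibrium once with no tax and once with a unit tax of 3, treating the two as alternative scenarios rather than as a sequence in time.

```python
def equilibrium(tax, a=100.0, b=2.0, c=10.0, d=4.0):
    """Return (price, output) for demand y = a - b*p and supply y = c + d*(p - tax).

    p is the price paid by buyers; sellers keep p - tax per unit.
    All coefficient values are illustrative assumptions.
    """
    # Setting demand equal to supply and solving for p.
    price = (a - c + d * tax) / (b + d)
    output = a - b * price
    return price, output


p_s, y_s = equilibrium(tax=0.0)   # scenario without a tax
p_t, y_t = equilibrium(tax=3.0)   # alternative scenario: tax of 3 per unit

print("without tax:", p_s, y_s)
print("with tax:   ", p_t, y_t)
print("price difference:", p_t - p_s, "tax:", 3.0)
```

In this illustration output is lower under the tax and the price difference is smaller than the tax, so that sellers absorb part of it. How the burden is divided depends on the assumed slopes b and d.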

This, then, is a standard illustration of a comparative-statics analysis 
making no explicit use of any maximization or minimization process. This 
is a characteristic of the use of comparative statics in macroeconomics, for 
a typical issue in that field is the difference that alternative modes of 
behavior of the money supply, governmental expenditure, etc., make to the 
equilibrium values of national income, employment, and inflation rates. 

Before turning to the comparative-statics analysis that incorporates a 
maximization process, let us use our supply-demand example to re- 
emphasize for the last time a fundamental matter of interpretation. It is 
easy but incorrect to say that the analysis shows that the imposition of an 
excise tax will lead price to rise from its previous level, but by less than 
the amount of the tax. That is an intertemporal interpretation which is 
valid only if none of the other relationships happen to shift during the 
period to which such a statement applies. That is, it assumes implicitly 
that the production costs or demand patterns are not changed either by 
the tax rise itself or by other unrelated influences. But, in any event, 
comparative-statics analysis makes no such intertemporal assertions. In- 
stead, its alternatives always represent substitute scenarios for an identical 
time interval: either a zero tax rate for the next year or a 5 per cent tax 
rate during the same period. 


3. Comparative Statics and Optimization: Example I (Cournot) 


The comparative-statics analysis to which we turn now has a simple 
structure which can easily be lost sight of in the course of the calculation. 
It takes some values of the parameters, $t_1, \ldots, t_m$, as given, and it sup- 
poses that the values of the endogenous variables, $y_1^0, \ldots, y_n^0$, then emerge 
from an optimization calculation attributed to the decision-maker. One or 
more parameter values are then permitted to vary, usually by the small 
amount $dt_i$, and then one calculates the corresponding variations, $dy_j^0$, in 
the optimal values of the endogenous variables, i.e., the variables under 
the control of the decision-maker. 

Probably the earliest and simplest examples of such a calculation are 
those that were provided by the French mathematician A. A. Cournot in 
his little masterwork of 1838.¹ The structure of the analysis is still used 
today in totally unchanged form. 

Cournot’s model, which we turn to now, is that of a profit-maximizing 
monopolist turning out a single product whose quantity is y and which he 
sells at price p. For simplicity, Cournot assumes the cost of providing the 
product to be zero (he describes it as a mineral water which flows cost- 
lessly from the monopolist’s spring). Cournot then shows that if a tax 
rate of t francs per unit had been imposed on the product the monopolist 
would have found it profitable to sell a smaller amount than he would 
have in the absence of the tax. 


¹ A. A. Cournot, Mathematical Principles of the Theory of Wealth, N. T. Bacon, 
trans. (Irving Fisher, ed.), The Macmillan Company, New York, 1897. 


The model which shows this is straightforward. Let 


(1) $y = f(p), \quad f'(p) < 0$ 


be the demand function for the mineral water. Then with costs assumed 
zero, the profit function is 


(2) $\pi = (p - t)y = pf(p) - tf(p)$. 

Using standard simplified notation for the derivatives we write 


$\pi_p$ for $\partial\pi/\partial p$ and $\pi_{pp}$ for $\partial^2\pi/\partial p^2$, etc. 


Then the requirements for profit maximization are 


(3) $\pi_p = f(p) + pf'(p) - tf'(p) = 0$ (first-order condition) 
(4) $\pi_{pp} < 0$ (second-order condition). 


So far we have carried out no more than the ordinary maximization 
process. Now, however, we ask what change in value of p is consistent with 
maintenance of the equilibrium condition (3) if for the value of the tax 
rate t there is substituted an alternative tax rate, t + dt. To determine 
this we permit t to vary by the amount dt in (3) and simultaneously permit 
the seller's price, p, to vary by the amount dp and see what combinations 
of the two are consistent with maintenance of the equilibrium requirement 
$\pi_p = 0$, that is, which values of dp and dt result in a zero change in $\pi_p$, 
i.e., in $d\pi_p = 0$. 

To answer this question we must find $d\pi_p$, the total differential of (3) 
when t and p are both permitted to vary.² This gives us [since $\pi_{pt} = -f'(p)$ 
by (3)], 


(5) $d\pi_p = \pi_{pp}\,dp + \pi_{pt}\,dt = \pi_{pp}\,dp - f'(p)\,dt = 0$. 

[Here we could, of course, also have calculated an explicit expression for 
$\pi_{pp}$ by partial differentiation of (3) with respect to p, but as we will see in 
a moment, that is unnecessary for our purposes.] 


² The reader may find it helpful to review the discussion of the total differential in 
Section 7 of Chapter 4. 

From (5) we can now solve for dp/dt, obtaining 

(6) $dp = [f'(p)/\pi_{pp}]\,dt$ or $dp/dt = f'(p)/\pi_{pp}$. 


From the second-order conditions, (4), we see that the denominator, $\pi_{pp}$, 
is negative. From the assumption that the demand function (1) has a 
negative slope, we see that the numerator f'(p) is also negative. Hence we 
conclude 


(7) $dp/dt = f'(p)/\pi_{pp} > 0$, 

and so, by (1), 

(8) $dy/dt = (dy/dp)(dp/dt) = f'(p)\,dp/dt < 0$. 


These are the comparative-statics results we were seeking. A higher 
tax rate (dt > 0) will induce the profit-maximizing supplier to charge a higher 
product price and to provide a lower output. 
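
Results (7) and (8) can be checked numerically. The following is a minimal sketch in Python under an assumed linear demand function; the coefficients 120 and 2, the tax rate of 10, and the function names are illustrative assumptions and are not taken from Cournot. The sketch solves the first-order condition (3) by bisection, evaluates the right-hand side of (7), and compares it with a direct comparison of the equilibrium prices under two neighboring tax rates.

```python
def demand(p):
    """Assumed linear demand y = f(p); the numbers are illustrative only."""
    return 120.0 - 2.0 * p


def demand_slope(p):
    """f'(p), negative as required by (1)."""
    return -2.0


def pi_p(p, t):
    """First-order expression (3): f(p) + p f'(p) - t f'(p)."""
    return demand(p) + p * demand_slope(p) - t * demand_slope(p)


def optimal_price(t, lo=0.0, hi=60.0):
    """Solve pi_p(p, t) = 0 by bisection; pi_p falls as p rises because pi_pp < 0."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pi_p(mid, t) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


t, h = 10.0, 1e-3
p_star = optimal_price(t)

# Right-hand side of (7): f'(p) / pi_pp, with pi_pp taken by central difference.
pi_pp = (pi_p(p_star + h, t) - pi_p(p_star - h, t)) / (2.0 * h)
dp_dt_formula = demand_slope(p_star) / pi_pp

# Direct comparison of the two equilibria, t - h versus t + h.
dp_dt_direct = (optimal_price(t + h) - optimal_price(t - h)) / (2.0 * h)
dy_dt = demand_slope(p_star) * dp_dt_formula  # equation (8)

print(p_star, dp_dt_formula, dp_dt_direct, dy_dt)
```

For this demand curve the two estimates of dp/dt agree and are positive, while dy/dt is negative, as (7) and (8) require. The magnitudes depend entirely on the assumed demand function; only the signs are general.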


4. Dissection of the Process: The Crucial Step of Total Differentiation 


As a guide for some of the more complicated examples that follow, and 
in order to make clearer the logic of the analysis, we pause now to charac- 
terize and interpret each of the steps in the comparative-statics calculation. 

The first few elements encompassing relationships (1)-(4) may be 
summarized in the following two obvious steps: 


Step 1: Gather the information that we take as given, including 
premises such as that about the shape of the firm’s demand curve (1) and 
the nature of the objective function. These preliminaries also include 


Step 2: Carry out the optimization calculation, spelling out the as- 
sumption that the second-order conditions, such as (4), are satisfied. This 
premise, which plays a crucial role in this sort of comparative-statics 
analysis, must of course hold for the calculus optimization procedure to 
be legitimate. 


We come next to the critical step which constitutes the core of the 
process and whose logic requires some explanation: 


Step 3: Set equal to zero the total differential of the first-order con- 
ditions, permitting, in the process, variation in the values of all of the 
endogenous variables and of the parameters whose influence is under 
examination. 

There are several natural questions about this step: (a) Why is it the 
first-order condition ($\pi_p$ in our example) and not the maximand ($\pi$) which 
is differentiated totally, and (b) what right have we to set the total 
differential equal to zero? 

The answer to question (a) is that comparative statics deals with 
equilibrium relationships, which we determine with the aid of the first- 
order conditions such as $\pi_p = 0$, not the maximand, $\pi$. In any event, we 
do not have a usable relationship between $\pi$ and the parameters. With 
one value of the parameter it will take some value $\pi^*$ and with another 
value of the parameter we will instead have $\pi = \pi^{**}$, but we have no 
prior information about the comparative values of $\pi^*$ and $\pi^{**}$. However, 
we do know the equilibrium values of $\pi_p^*$ and $\pi_p^{**}$, the partial derivatives 
of the objective function under the two values of the parameter, for if 
with the one value of the parameter we are to attain equilibrium, then (3) 
tells us we must have 


(9) $\pi_p^* = 0$, 


and, similarly, if we attain an equilibrium under the substitute parameter 
value (as the comparative-statics analysis requires), we must also have 


(10) $\pi_p^{**} = 0$. 


This tells us at once why we differentiate the first-order expressions $\pi_p$ 
and not the maximand, $\pi$, for we know precisely what values $\pi_p$ must have 
with each of the different parameter values but we do not know that about 
$\pi$. Moreover, we can now immediately answer question (b), why we can 
set the differential of $\pi_p$ equal to zero, for we have, by comparison of (9) 
and (10), for the two choices of parameter values 


(11) $\pi_p^* = \pi_p^{**}$ or $d\pi_p = 0$. 


Looked at another way, the monopolist in the illustrative Cournot 
problem is faced with a fait accompli. Instead of the tax rate t he must 
pay the tax rate t + dt. To minimize the resulting damage he must change 
price by that quantity, dp, that restores the first-order conditions. Thus, 
if instead of the parameter value t, the tax rate is set at t + dt, then 
unless p is adjusted we may expect $\pi_p$ to be affected. The best the monop- 
olist can now do is to adopt a different price p + dp, which offsets any 
change in $\pi_p$ and restores it to zero, as the first-order condition requires. 
That is precisely what is accomplished in the total differentiation equation 
(5), which is the crucial step in the comparative-statics process, because 
it gives us the desired relation between changes in parameter values, dt, 
and the corresponding changes in the equilibrium values of the endogenous 
variables such as dp and dy. 

The remaining steps of the comparative-statics analysis are now easily 
described. 


Step 4: Solve the total differential equations for the derivatives of the 
endogenous variables with respect to the parameter values (dp/dt in the 
Cournot example). Here it is useful to regard the total differential equation 
(5) as a single equation whose two variables are dp and dt. That is, we 
take as our variables for the purposes of this calculation not the price and 
tax rate p and t but the changes in their values. By the fundamental rule 
for total differentiation the variables dp, dt, etc., never enter in a more 
complicated form such as (dp)? or log dp or anything of that sort. The 
total differential equation will always be a linear relationship in the dp, dt 
and the other changes in endogenous variable and parameter values. This 
always makes it much easier to solve for the derivatives such as dp/dt than 
if nonlinear relationships were involved. We come finally to 


Step 5: Evaluate the sign of the derivatives whose expression is ob- 
tained in the preceding step. Note, however, that it is not always possible 
to determine that sign, because in some cases the corresponding derivatives 
can in fact go either way. For example, if x is a consumer’s purchase of 
some good and m is his income, we expect that dx/dm will be negative for 
an inferior good and positive for a normal good. Consequently, we might 
well question any mathematical result which claimed to determine the 
sign of $\partial x/\partial m$ unambiguously. But in other cases, as in the Cournot 
model, an unambiguous sign will be arrived at. Here two types of informa- 
tion will normally be helpful: (a) the second-order conditions, such as 
$\pi_{pp} < 0$, and (b) premises about other economic relationships obtained 
from wider considerations; for example, we used in the Cournot discussion 
the premise that the firm’s demand curve has a negative slope. It transpires 
that the second-order conditions always make an appearance in the ex- 
pression for the derivatives obtained in Step 4 and, in particular, that they 
determine the sign of the denominator of that derivative just as they did 
that of $\pi_{pp}$ in the solution expression (7) for the Cournot problem. This, 
too, is no mere accident but a necessary consequence of the logic of the 
problem, as will be seen later in this chapter. 


In seeking some qualitative result one must never be hasty in accepting 
the conclusion that the behavior of the term in question is generally 
indeterminate. A sign may appear to be ambiguous at first, and yet some 
ingenuity may suddenly reveal its secret. Very frequently the first-order 
conditions will be helpful here, indicating relationships that prove crucial 
in the solution process. 


5. Digression on Second-Order Conditions in Multivariable Models 


Since the second-order conditions play so important a role in the 
comparative-statics analysis, we will have to discuss explicitly the form 
they take in models more complicated than the one we have examined so 
far. The generalized form of those second-order conditions will be reported 
in Section 7. For the moment we will only discuss why these conditions 
are not merely a straightforward extension of the requirements for the 
case z = f(x). As soon as we deal with a model containing a multiplicity 
of variables we enter a new realm of complication if calculus methods are 
to be used exclusively. 

To understand the source of the difficulty we start off by reviewing 
the simplest case of a maximand with one (independent) variable, i.e., 
where the objective is to maximize some function such as 


z = f(x). 


Here the second-order condition assures us that we are dealing with a 
graph such as that in Figure 2a rather than in 2b or 2c. 


Figure 2: panels (a), (b), and (c) 


The graph in 2a, it will be recalled, can yield a unique interior max- 
imum³ because the first derivative is constantly declining (i.e., the second 
derivative is negative). This means that from the maximum point m it 
does not pay to go in either direction. Where, instead of our simple objective 
function z = f(x), we are attempting to 


maximize $z = f(x_1, x_2)$ 


we hope, analogously, to have a graph such as that in Figure 3. Now in 
such a graph all the second (partial) derivatives must, indeed, still be 
negative. For example, by taking a cross section parallel to the $x_1$ axis of 
the hill in the previous diagram, we see that we obtain a curve RMV with 
the same shape as that in Figure 2a, that is, we will have $z_{11} = \partial^2 z/\partial x_1^2 < 0$. 
Hence, we still have as necessary conditions for our calculations the neg- 
ativity of the second partial derivatives, i.e., 


(12) $z_{11} < 0, \quad z_{22} < 0$. 


Figure 3 


³ It will be recalled that an interior maximum is one that does not occur at a 
“corner” of the diagram, i.e., at a point where a constraint, such as a nonnegativity 
condition, prevents further movement of a variable. For example, if we require $x \ge 0$, 
then Figures 2b and 2c do have maxima at points B and C, respectively, but these are 
corner maxima, not interior maxima. Cf. Chapter 3, Section 10, above. 


However, conditions (12) are not sufficient by themselves to do the job. 
That is, we can have cases where (12) is satisfied and yet there is no 
ordinary interior maximum. Such a case is illustrated in Figure 4 by 
surface ARBCVD, whose shape may be described as that of a sagging 
awning going diagonally above the floor of the diagram. As Figure 4 
shows, $z_{22}$ is negative because a cross section along HJ which is parallel 
to the $x_2$ axis yields an inverted U-shaped cross section HIJ, like the cor- 
responding curves in Figures 2a and 3. Similarly, it is not difficult to see 
that in Figure 4 we also have $z_{11} < 0$ (cross section EFG). Thus, con- 
ditions (12) are both satisfied and yet surface ARBCVD clearly has no 
interior maximum point M such as those in Figures 2a and 3. 

It is not difficult to see what has gone wrong. $z_{11} = \partial^2 z/\partial x_1^2$ only tells 
us about cross sections taken in the east-west direction, i.e., cross sections 
parallel to the $x_1$ axis (like that above AB in Figure 3). Similarly, $z_{22}$ gives 
us information just about the curvature of a cross section cutting through 
the relevant surface in the south-north direction (e.g., like that above HJ 
in Figure 4). For the entire surface to have the proper curvature like that 
in Figure 3 any and every cross section of the graph of the maximand must 
have an inverted U shape. This must be true not only for cross sections 
taken parallel to the axes, such as those above AB in Figure 3 or above 
HJ in Figure 4, but also for any other cross section such as those above 
KL in Figures 3 and 4. Now it should be clear that the curvature of any 
such cross section goes in the right direction in Figure 3; it also has an 
inverted U shape. But the corresponding cross section above KL in Figure 
4 is RIFV, which is an uninverted U; it is shaped like the curve in Figure 2b 
and is therefore not a well-behaved surface from the viewpoint of max- 
imization. Clearly, what is required is that the graph of the objective 
function (or of the Lagrangian expression in the case where the problem 
has constraints) be concave (downward) along every cross section,⁴ as is 
obviously true of the graph in Figure 3. Using the formal definition of 
concavity of a function that was given in Chapters 7 and 9, one can then 
proceed to draw many of the same conclusions as we obtain via the 
calculus methods. The means by which this is done are described in some 
detail in Section 10 of Chapter 14. Alternatively, we can formulate second- 
order conditions in terms of the effects upon the maximand not only of a 
change in $x_1$ itself, holding $x_2$ fixed (or vice versa), but also of simultaneous 
variation in the values of both variables, as indicated by the values of the 
cross-partial derivatives such as $z_{12} = \partial^2 z/\partial x_1 \partial x_2$ as well as the $z_{11}$ and 
the $z_{22}$. This way of dealing with the matter will be described in Sections 
6-8 of this chapter. 


⁴ Recall also that sometimes quasi-concavity will do. See Chapter 9, Section 19. 
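
The insufficiency of conditions (12) can be seen in a small numerical illustration. The sketch below is a hypothetical example in Python: the function z = −x1² − x2² + 3·x1·x2 is an assumption chosen for convenience and is not the surface drawn in Figure 4. For this function both own second partials equal −2, so (12) holds, and both first partials vanish at the origin; yet a cross section taken along the diagonal x1 = x2 turns upward.

```python
def z(x1, x2):
    """Assumed example surface: z11 = z22 = -2 < 0, yet no interior maximum."""
    return -x1 ** 2 - x2 ** 2 + 3.0 * x1 * x2


step = 0.1
center = z(0.0, 0.0)            # both first partials vanish at the origin

east_west = z(step, 0.0)        # move parallel to the x1 axis
south_north = z(0.0, step)      # move parallel to the x2 axis
diagonal = z(step, step)        # move along the line x1 = x2

# Moves parallel to either axis lower z, but the diagonal move raises it.
print(east_west < center, south_north < center, diagonal > center)
```

Along the diagonal the function reduces to z = x1², an uninverted U, so the origin is not a maximum even though every cross section parallel to an axis has the inverted U shape. The size of the cross-partial term (here 3) relative to the own second partials is what produces this outcome.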
6. Example II: The Slutsky Theorem in a Two-Input Firm 


As an intermediate step toward a description of the full comparative- 
statics analysis, let us turn to a classic example of this process: the deriva- 
tion of the Slutsky theorem for a business firm, the statement that the 
profit-maximizing firm will always use a lower quantity of an input when 
a higher price of that input is substituted for a lower one. 

To make the discussion easier we will take the firm to have only two 
inputs between which to divide its expenditures, which, as we will see in 
the following sections, permits a considerable simplification of the algebra. 

The model requires our firm to minimize the expenditure it devotes to 
its inputs for any given output level, y*, i.e., to 


(13) Min $E = p_1x_1 + p_2x_2$ 

subject to its production-function constraint 

(14) $f(x_1, x_2) = y^*$ (for whatever level of y* happens to be selected), 


yielding the Lagrangian expression (Section 8 of Chapter 4) 


(15) $L = p_1x_1 + p_2x_2 + \lambda[y^* - f(x_1, x_2)]$. 


This is a model with three endogenous variables, $x_1$, $x_2$, and the Lagrangian 
multiplier $\lambda$ (whose optimal value, $\lambda^0$, has been shown in the Kuhn-Tucker analysis 
to give us $\lambda^0 = \partial E/\partial y^*$, i.e., the marginal cost of an increase in output). 
It also contains three parameters, $p_1$, $p_2$, and y*. A comparative-statics 
analysis can then investigate the effects of a change in the value of any 
one or more of these parameters upon the values of each of the endogenous 
variables. Here we will be concerned with just one of these comparative- 
statics relationships, with $\partial x_1/\partial p_1$. 


To get this value, we start off, as usual, by differentiating (15) in turn 
with respect to the variables $x_1$, $x_2$, and $\lambda$ to obtain the first-order con- 
ditions⁵ 


$\partial L/\partial x_1 = p_1 - \lambda f_1 = 0$ 
(16) $\partial L/\partial x_2 = p_2 - \lambda f_2 = 0$ 
$\partial L/\partial \lambda = y^* - f(x_1, x_2) = 0$. 


⁵ The subsequent discussion can be simplified somewhat by using the first two 
equations in (16) to eliminate $\lambda$. This gives us, instead of the three equations in (16), the 
combined equation $p_1/p_2 = f_1/f_2$ plus the last of Equations (16). This step is avoided 
here for two reasons: first, because for expository purposes it is desirable to begin with 
the most standard procedures, and, second, because variables suppressed in the way 
just described may remain hidden in the model and may constitute a potential source 
of error. 


For the reasons discussed in the previous section, the second-order con- 
ditions of this problem with its multiplicity of variables are not a simple 
extension of the single-variable case. We would expect that in this min- 
imization problem they require (writing $L_{11}$ for $\partial^2 L/\partial x_1^2$, etc.) 


(17) $L_{11} = -\lambda f_{11} > 0, \quad L_{22} = -\lambda f_{22} > 0$ 


(though we note by direct differentiation that $L_{\lambda\lambda} = 0$ and is consequently 
not positive). Since from (16) $\lambda = p_1/f_1 > 0$ (where $f_1$ is the marginal 
product of input 1 and hence is presumably positive) conditions (17) 
amount to the plausible premise that inputs $x_1$ and $x_2$ each satisfy the 
“law” of diminishing marginal returns, i.e., that $f_{11} < 0$ and $f_{22} < 0$. 
However, for the reasons discussed in the preceding section, in such a 
multiple-variable case the conditions (17) are not adequate for our pur- 
poses. It can be shown (though it will not be proved here) that in the 
present case the second-order conditions require 


(18) $\lambda[f_{11}f_2^2 - f_1f_2(f_{12} + f_{21}) + f_{22}f_1^2] < 0$. 


The general expression from which this inequality is derived will be given 
in Section 7. For the moment this expression merely serves to limit the 
degree of influence of the cross-partial terms, $f_{12}$ and $f_{21}$. We note that if 
$f_{12}$ and $f_{21}$ happened both to be zero, then this condition would auto- 
matically be satisfied if indeed⁶ $f_{11} < 0$ and $f_{22} < 0$. 


5 Moreover, if $f_{12} > 0$ and $f_{21} > 0$, condition (18) is automatically satisfied because
then every term in the expression will be negative. Here $f_{12} > 0$ can be interpreted as a
sort of complementarity between the two inputs; it means that increased use of one of
the inputs increases the marginal product of the other. Thus, only in the case where
$f_{12} < 0$, $f_{21} < 0$, and where these cross-partial derivatives are relatively large, i.e., the
case of substantial substitutability (in the same sense in which “complementarity” was
just implicitly defined), does satisfaction of condition (18) run into problems.


Having gathered our premises and carried out our maximization
calculations (steps 1 and 2), we are now ready for the critical step in
which we differentiate totally our first-order conditions, permitting
variation in each of our endogenous variables, $x_1$, $x_2$, and $\lambda$, and in the
parameter, $p_1$. Since our first-order conditions are composed of three equations
we obtain, correspondingly, by this process, three total differential
equations:


$-\lambda f_{11}\,dx_1 - \lambda f_{12}\,dx_2 - f_1\,d\lambda + dp_1 = 0$

$-\lambda f_{21}\,dx_1 - \lambda f_{22}\,dx_2 - f_2\,d\lambda = 0$

$-f_1\,dx_1 - f_2\,dx_2 = 0,$
where, since we are only permitting the parameter, $p_1$, to vary, $dp_2 = 0$ by
assumption. For convenience, we bring the term involving the parameter
change, $dp_1$, over to the right-hand side and then divide all three equations
through by $-dp_1$ to obtain

$\lambda f_{11}\,dx_1/dp_1 + \lambda f_{12}\,dx_2/dp_1 + f_1\,d\lambda/dp_1 = 1$
(19)  $\lambda f_{21}\,dx_1/dp_1 + \lambda f_{22}\,dx_2/dp_1 + f_2\,d\lambda/dp_1 = 0$
$f_1\,dx_1/dp_1 + f_2\,dx_2/dp_1 = 0.$


Taking $dx_1/dp_1$, $dx_2/dp_1$, and $d\lambda/dp_1$ as the unknowns we can now
treat (19) as a set of three simultaneous linear equations in three unknowns
and use the usual methods to solve for these unknowns.⁷

Our objective is to solve for $dx_1/dp_1$, which means that we wish to
eliminate $dx_2/dp_1$ and $d\lambda/dp_1$ from (19). The straightforward way of
eliminating the latter is to multiply through the first two equations,
respectively, by $f_2$ and $f_1$ and then subtract the second equation from the
first, to yield


(20)  $\lambda(f_2 f_{11} - f_1 f_{21})\,dx_1/dp_1 + \lambda(f_2 f_{12} - f_1 f_{22})\,dx_2/dp_1 = f_2.$

Next, from the last equation in (19) we can eliminate $dx_2/dp_1$ by writing

$dx_2/dp_1 = -(f_1/f_2)\,dx_1/dp_1,$

which, when substituted into (20), gives us as the expression for $dx_1/dp_1$,
which we are seeking,

$\lambda[f_2 f_{11} - f_1 f_{21} - (f_2 f_{12} - f_1 f_{22})(f_1/f_2)]\,dx_1/dp_1 = f_2,$


7 The next few steps merely represent the tedious calculations needed to solve the
simultaneous system for $dx_1/dp_1$, and the reader may prefer to go directly to the solution,
Equation (21).


or, multiplying through by $f_2$ and dividing through by the expression that
precedes $dx_1/dp_1$,

(21)  $\dfrac{dx_1}{dp_1} = \dfrac{f_2^2}{\lambda[f_2^2 f_{11} - f_1 f_2(f_{12} + f_{21}) + f_1^2 f_{22}]}.$


To interpret this result the reader will note first that the denominator
in (21) is indeed the same as the expression in the second-order condition
(18), a characteristic of comparative-statics arguments that had been
pointed out earlier.

Moreover, from (18) we see at once that if the second-order conditions
are satisfied, then the denominator of (21) is necessarily negative. Since the
numerator, $f_2^2$, is positive, it then
follows at once that $dx_1/dp_1$ must be negative, which is what we wanted
to prove. That is, we have proved that the competitive firm in equilibrium
will indeed have a derived demand curve for inputs that is negatively
sloping: it will use less of the input when its price is higher.
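The algebra leading to (21) can be checked numerically. The following is a minimal sketch in Python, using made-up values for $\lambda$, the marginal products, and their derivatives (none of these numbers come from the text). It computes $dx_1/dp_1$ from (21), backs out the other two unknowns as in the elimination above, and confirms that all three equations of (19) are satisfied exactly.

```python
from fractions import Fraction as F

# Hypothetical values chosen only for illustration (not from the text).
lam = F(1)                 # Lagrange multiplier, lambda > 0
f1, f2 = F(2), F(3)        # marginal products, both positive
f11, f22 = F(-1), F(-2)    # diminishing marginal returns
f12 = f21 = F(1, 2)        # cross-partial derivatives

# Equation (21): closed-form solution for dx1/dp1.
denominator = lam * (f2**2 * f11 - f1 * f2 * (f12 + f21) + f1**2 * f22)
dx1 = f2**2 / denominator

# Back out the remaining unknowns as in the text.
dx2 = -(f1 / f2) * dx1                                # third equation of (19)
dlam = (1 - lam * f11 * dx1 - lam * f12 * dx2) / f1   # first equation of (19)

# Residuals of the three equations in (19); each should be exactly zero.
residuals = [
    lam * f11 * dx1 + lam * f12 * dx2 + f1 * dlam - 1,
    lam * f21 * dx1 + lam * f22 * dx2 + f2 * dlam,
    f1 * dx1 + f2 * dx2,
]

print("denominator of (21):", denominator)
print("dx1/dp1:", dx1)
print("residuals of (19):", residuals)
```

With these assumed values the denominator is negative, as condition (18) requires, and so the computed slope of the derived demand curve is negative.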


Question: Why is there no income effect in (21), the Slutsky equation 
for the firm? 


The procedures illustrated in this section constitute the essence of the 
comparative-statics methods. The remaining sections of this chapter will 
extend them in just two ways. First, the second-order conditions will be 
given for the case where there are $n$ variables and $m$ constraints, and,
second, we will employ Cramer's rule to help in the solution of the simul- 
taneous equations constituted by the total differential equations of the 
analysis. The entire discussion will make fairly heavy use of determinants, 
and so the reader who is not acquainted with the elementary properties of 
determinants will either have to learn about them from any of the large 
variety of available sources, or he will have to leave the chapter at this 
point. 


7. Second-Order Conditions, Constrained n-Variable Problems: Bordered Hessians 


We now report the second-order conditions for a constrained
maximization or minimization problem with a multiplicity of variables.
Consider the problem

Max (or Min) $v = f(x_1, x_2, x_3, x_4)$

subject to the constraints

(22)  $g(x_1, x_2, x_3, x_4) = 0$
      $h(x_1, x_2, x_3, x_4) = 0.$


8 See, e.g., R. G. D. Allen, Mathematical Analysis for Economists, The Macmillan
Company, London, 1938, Chapter 18.


For explicitness, we deal with this case involving four variables and two 
constraints, but everything said about this case is extendable directly to 
cases involving n variables and m constraints for any positive integers, 
$n$ and $m$, provided⁹ $m < n$.

In that case, we have the Lagrangian 


$L = f(\cdot) - \alpha g(\cdot) - \beta h(\cdot),$


where we use $f(\cdot)$ to denote $f(x_1, x_2, x_3, x_4)$, etc.

To state the second-order conditions, we need a number of definitions. 
Writing, as usual, $L_{ij}$ for $\partial^2 L/\partial x_i\,\partial x_j$ and $g_i$ for $\partial g/\partial x_i$, etc., we define the
bordered Hessian determinant of the system (22) as


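(23)  $H = \begin{vmatrix}
L_{11} & L_{12} & L_{13} & L_{14} & -g_1 & -h_1 \\
L_{21} & L_{22} & L_{23} & L_{24} & -g_2 & -h_2 \\
L_{31} & L_{32} & L_{33} & L_{34} & -g_3 & -h_3 \\
L_{41} & L_{42} & L_{43} & L_{44} & -g_4 & -h_4 \\
-g_1 & -g_2 & -g_3 & -g_4 & 0 & 0 \\
-h_1 & -h_2 & -h_3 & -h_4 & 0 & 0
\end{vmatrix}.$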


This determinant is composed of all of the second-partial and cross-partial 
derivatives of the Lagrangian, L, bordered by the last two rows and 
columns of the partial derivatives of the constraints. 

We define next the first lower-order principal minors, $H_{ii}$, as the
minors of $H$ obtained by deleting the $i$th row and the $i$th column of $H$,
i.e., as $H_{11}$, $H_{22}$, $H_{33}$, and $H_{44}$, where, for example,


$H_{11} = \begin{vmatrix}
L_{22} & L_{23} & L_{24} & -g_2 & -h_2 \\
L_{32} & L_{33} & L_{34} & -g_3 & -h_3 \\
L_{42} & L_{43} & L_{44} & -g_4 & -h_4 \\
-g_2 & -g_3 & -g_4 & 0 & 0 \\
-h_2 & -h_3 & -h_4 & 0 & 0
\end{vmatrix},$

$H_{22} = \begin{vmatrix}
L_{11} & L_{13} & L_{14} & -g_1 & -h_1 \\
L_{31} & L_{33} & L_{34} & -g_3 & -h_3 \\
L_{41} & L_{43} & L_{44} & -g_4 & -h_4 \\
-g_1 & -g_3 & -g_4 & 0 & 0 \\
-h_1 & -h_3 & -h_4 & 0 & 0
\end{vmatrix}.$


9 If m = n, the system has as many constraints as unknowns, and, provided they
are independent, these can determine the values of the variables, leaving nothing to be
maximized. If n < m, we have more equations than unknowns, which may well not even
have a consistent solution.


The reader may wish to write out the minors H33 and H44 for himself. 

Similarly, we define the next lower-order principal minors $H_{ijij}$ of the
Hessian, $H$, as the minor obtained by deleting the $i$th and $j$th rows and
the $i$th and $j$th columns of $H$. These fourth-order minors include $H_{1212}$,
$H_{1313}$, $H_{2323}$, $H_{2424}$, and $H_{3434}$, where, for example,

$H_{1212} = \begin{vmatrix}
L_{33} & L_{34} & -g_3 & -h_3 \\
L_{43} & L_{44} & -g_4 & -h_4 \\
-g_3 & -g_4 & 0 & 0 \\
-h_3 & -h_4 & 0 & 0
\end{vmatrix}.$


In the same way we can go on defining principal minors of the Hessian 
of successively lower orders. 

We can now state the following two propositions for which we will
attempt no proof:¹⁰


Proposition 1: The second-order conditions for a constrained
minimization problem such as (22) require the bordered Hessian and all of its
principal minors to be positive.


Proposition 2: The second-order conditions for maximization of a 
system such as (22) require that its bordered Hessian be negative if the 
number of its variables (other than Lagrange multipliers) is odd and that 
it be positive if the number of those variables is even. Moreover, if $H$ is
positive, its first-lower-order minors, $H_{ii}$, must all be negative, its
next-lower-order minors, $H_{ijij}$, must all be positive, etc. Similarly, if $H$ is
negative, the $H_{ii}$ must be positive, the $H_{ijij}$ must all be negative, etc. In
sum, the $H$, $H_{ii}$, $H_{ijij}$, etc., must alternate in sign.


In our example, since we have an even number of variables (the four
$x$'s), if we are maximizing we therefore require

$H > 0$
$H_{11} < 0,\ H_{22} < 0,\ H_{33} < 0,\ H_{44} < 0$
$H_{1212} > 0,\ H_{1313} > 0,\ H_{1414} > 0,\ H_{2323} > 0,$ etc.


10 For a derivation, see, e.g., P. A. Samuelson, Foundations of Economic Analysis, 
Harvard University Press, Cambridge, Mass., 1948, Appendix A. 
These conditions are used throughout the standard comparative-
statics analyses. In particular, they make use of the requirement that in a 
minimization problem H and its principal minors must all have the same 
sign, while in a maximization problem they must alternate in sign. 
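The construction of the bordered Hessian and of its principal minors is mechanical, and a small program may make it concrete. The sketch below is only an illustration: the helper names `det`, `bordered_hessian`, and `principal_minor` are invented here, and the example (a consumer maximizing the hypothetical utility $u = x_1 x_2$ subject to $2x_1 + 3x_2 = m$) is not taken from the text. It follows the sign convention of (23), in which the second derivatives of the Lagrangian are bordered by the negatives of the constraint derivatives.

```python
def det(m):
    """Determinant by expansion along the first column."""
    n = len(m)
    if n == 0:
        return 1
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total


def bordered_hessian(second_derivs, constraint_grads):
    """Border the n x n matrix of L_ij with minus the constraint gradients."""
    n, m = len(second_derivs), len(constraint_grads)
    rows = [list(second_derivs[i]) + [-g[i] for g in constraint_grads]
            for i in range(n)]
    for g in constraint_grads:
        rows.append([-gi for gi in g] + [0] * m)
    return rows


def principal_minor(h, deleted):
    """Delete the rows and columns whose (0-based) indices are in `deleted`."""
    keep = [k for k in range(len(h)) if k not in deleted]
    return det([[h[r][c] for c in keep] for r in keep])


# Assumed example: u = x1 * x2, so L11 = L22 = 0 and L12 = L21 = 1;
# the budget constraint 2*x1 + 3*x2 - m = 0 has gradient (2, 3).
H = bordered_hessian([[0, 1], [1, 0]], [[2, 3]])

print("H   =", det(H))
print("H11 =", principal_minor(H, {0}))
print("H22 =", principal_minor(H, {1}))
```

In this two-variable maximization example the determinant $H$ comes out positive and the minors $H_{11}$ and $H_{22}$ negative, which is the alternating pattern that Proposition 2 describes for an even number of variables.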


PROBLEMS 


1. Write out H2323 for H as given in (23). 
2. Show that condition (18) of the previous section follows from Proposition 1 
applied to the systems (13) and (14). 


8. Illustration III: The Slutsky Theorem for the Consumer


We now derive the Slutsky theorem for the consumer who maximizes
the utility he obtains from the consumption of $n$ commodities, in
quantities $x_1, \ldots, x_n$, subject to his budget constraint. We want to show that,
after elimination of the income effect, $\partial x_i/\partial p_i < 0$. The consumer seeks to

Max $u = u(x_1, \ldots, x_n)$

subject to

$\sum p_i x_i = m,$

whose Lagrangian is

(24)  $L = u(\cdot) + \lambda(m - \sum p_i x_i).$
Differentiating in turn with respect to $x_1, \ldots, x_n, \lambda$ we obtain the
first-order conditions

(25)  $u_1 - \lambda p_1 = 0$
      $\cdots$
      $u_n - \lambda p_n = 0$
      $-p_1 x_1 - \cdots - p_n x_n + m = 0.$


This time we will vary two parameters, $p_1$ and $m$, in order to be able to
determine $dx_1/dm$ and $dx_1/dp_1$ so that by comparison of the two we can
separate out the income and substitution effects in the latter.

Thus, differentiating totally each equation (25) we obtain 


(26)  $u_{11}\,dx_1 + u_{12}\,dx_2 + \cdots + u_{1n}\,dx_n - p_1\,d\lambda = \lambda\,dp_1$
      $u_{21}\,dx_1 + u_{22}\,dx_2 + \cdots + u_{2n}\,dx_n - p_2\,d\lambda = 0$
      $\cdots$
      $u_{n1}\,dx_1 + u_{n2}\,dx_2 + \cdots + u_{nn}\,dx_n - p_n\,d\lambda = 0$
      $-p_1\,dx_1 - p_2\,dx_2 - \cdots - p_n\,dx_n = x_1\,dp_1 - dm,$
where for convenience we have brought over to the right-hand sides of
these equations all terms involving $dm$ and $dp_1$, the changes in the
parameter values.

Next, to see the effect of a change in the consumer's budget alone on
his purchases of $x_1$, let us separate out the role of $dm$ and $dp_1$ by first
taking $dm \neq 0$, $dp_1 = 0$. Presently, we will reverse these premises to
examine the effect of a change in the price, $p_1$, on his purchases of $x_1$.

We may now use Cramer’s¹¹ rule to solve the system of linear equations
(26) in the variables $dx_1, dx_2, \ldots, dx_n, d\lambda$, obtaining from that rule

(27)  $dx_1 = \begin{vmatrix}
0 & u_{12} & \cdots & u_{1n} & -p_1 \\
0 & u_{22} & \cdots & u_{2n} & -p_2 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & u_{n2} & \cdots & u_{nn} & -p_n \\
-dm & -p_2 & \cdots & -p_n & 0
\end{vmatrix} \Big/ H,$

where the reader should verify that the denominator of (27), which by
Cramer’s rule is the determinant of (26), is indeed the Hessian, $H$, of our
maximization problem.
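For readers who like to see Cramer's rule at work on numbers, here is a minimal sketch in Python. The function names `det` and `cramer_solve` are invented for this illustration. The rule replaces one column of the coefficient matrix at a time with the column of constants and divides the resulting determinant by the determinant of the system. It is applied to the pair of equations $2x_1 + 3x_2 = 4$ and $5x_1 + 6x_2 = 7$.

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first column."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total


def cramer_solve(a, b):
    """Solve a x = b by Cramer's rule; a is a square list of lists."""
    d = det(a)
    if d == 0:
        raise ValueError("determinant of the system is zero")
    solution = []
    for j in range(len(a)):
        # Replace column j of the system with the column of constants.
        replaced = [row[:j] + [b[i]] + row[j + 1:] for i, row in enumerate(a)]
        solution.append(Fraction(det(replaced), d))
    return solution


x1, x2 = cramer_solve([[2, 3], [5, 6]], [4, 7])
print(x1, x2)  # prints: -1 2
```

Exact fractions are used so that the result can be substituted back into the two equations without rounding error.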


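11 Cramer’s rule gives us the solution in terms of determinants of a system of
simultaneous linear equations. For example, suppose we are given the pair of equations
$2x_1 + 3x_2 = 4$ and $5x_1 + 6x_2 = 7$. Then the determinant of the system is

$D = \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} = 2 \times 6 - 3 \times 5 = -3.$

Cramer's rule states that provided $D \neq 0$ (as is true in our example) then

$x_1 = \begin{vmatrix} 4 & 3 \\ 7 & 6 \end{vmatrix} \Big/ D = 3/(-3) = -1, \qquad x_2 = \begin{vmatrix} 2 & 4 \\ 5 & 7 \end{vmatrix} \Big/ D = -6/(-3) = 2.$

More generally, in the system of simultaneous linear equations

$a_{11}x_1 + \cdots + a_{1n}x_n = b_1$
$\cdots$
$a_{n1}x_1 + \cdots + a_{nn}x_n = b_n$

we find the value of a variable $x_i$ from the formula $x_i = A/D$, where $D$ is the
determinant of the system

$D = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}$

and $A$ is another determinant obtained from $D$ by replacing the $i$th column in $D$ with
the column of constants, $b_1, \ldots, b_n$.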
Expanding the determinant of (27) in terms of the elements of its
first column¹² we obtain

(28)  $dx_1 = -dm\,H_{n+1,1}/H$  or  $dx_1/dm = -H_{n+1,1}/H,$

where $H_{n+1,1}$ is the cofactor (the signed minor) of the element in the
$(n + 1)$st row and the 1st column of $H$. We will see shortly that we know
very little about the nature of the expressions in (28). However, it will
help us a few paragraphs later to separate out the income effect.

First, however, we must return to the total differential equations (26)
to determine the consequences of a price change. This time, therefore, we
take $dp_1 \neq 0$, $dm = 0$. Again using Cramer's rule to solve for $dx_1$ we
obtain this time, instead of (27),

(29)  $dx_1 = \begin{vmatrix}
\lambda\,dp_1 & u_{12} & \cdots & u_{1n} & -p_1 \\
0 & u_{22} & \cdots & u_{2n} & -p_2 \\
\vdots & \vdots & & \vdots & \vdots \\
0 & u_{n2} & \cdots & u_{nn} & -p_n \\
x_1\,dp_1 & -p_2 & \cdots & -p_n & 0
\end{vmatrix} \Big/ H,$

or expanding in terms of the first column and using (28),

(30)  $dx_1/dp_1 = \lambda H_{11}/H + x_1 H_{n+1,1}/H = \lambda H_{11}/H - x_1\,dx_1/dm.$


We will show next that $-x_1\,dx_1/dm$, the second term in (30), is the
income effect of the change in price, i.e., that it is $(dx_1/dm)(\partial m/\partial p_1)$, so
that the remaining term is the substitution effect. For this purpose first
note that if $p_1$ changes (e.g., it rises), then the resulting fall in purchasing
power (the “compensating variation” in income) will be $\partial m/\partial p_1 = -x_1$.
If a person is purchasing, say, 7 shirts, and shirts rise in price by one
dollar, then he will have lost 7 dollars in purchasing power, i.e., he will


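12 It will be recalled that when we expand a determinant such as

$A = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$

in terms of its first column we obtain

$A = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{21}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} + a_{31}\begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix} = \sum_i (-1)^{i+1} a_{i1} A_{i1},$

where $A_{i1}$ is the minor of the determinant $A$ obtained by eliminating its $i$th row and
first column. The same rule permits us to expand $A$ in terms of any other column or in
terms of any of its rows. The minor $A_{ij}$ multiplied by $(-1)^{i+j}$ is called the cofactor
of $a_{ij}$.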
need 7 additional dollars to be able to purchase the same number of shirts
as before. Similarly, if he is purchasing $x_1$ units of commodity 1 and the
price of that commodity rises by a dollar, his loss in real income can be
taken to be $-x_1 = \partial m/\partial p_1$. Substituting this result into the last term of
(30) we obtain


(31)  $-x_1\,dx_1/dm = (dx_1/dm)(\partial m/\partial p_1).$


This shows that (31) is indeed the income effect, i.e., it is the portion of
the effect on $x_1$ of the rise in price that is transmitted via the effect of $dp_1$
on the purchasing power of $m$.

This means that the remaining term in (30) is the substitution effect
whose sign it is our purpose to determine. We have by (25) $\lambda = u_1/p_1 > 0$
and by the second-order conditions (Proposition 2 of the preceding section)
that $H_{11}$ and $H$ are of opposite signs. Hence we deduce, at last, that


(32)  the substitution-effect term $= \lambda H_{11}/H < 0.$


This is the Slutsky theorem for the consumer, which we have sought
to prove. It states that after elimination of the income effect (31),
$\partial x_1/\partial p_1$ as given by (32) will always be negative.
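A numerical check of the decomposition (30) may be helpful. The sketch below assumes a particular utility function, $u = x_1 x_2$, and particular prices and income; none of these comes from the text, and the helper names are invented. For this utility the demand for good 1 is $x_1 = m/2p_1$, so the derivatives obtained from the determinants in (28), (30), and (32) can be compared with those found by differentiating the demand function directly.

```python
from fractions import Fraction as F


def det(m):
    """Determinant by expansion along the first column."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total


def cofactor(m, i, j):
    """Signed minor of element (i, j), 0-based indices."""
    minor = [[m[r][c] for c in range(len(m)) if c != j]
             for r in range(len(m)) if r != i]
    return (-1) ** (i + j) * det(minor)


# Assumed example: u = x1 * x2 with n = 2 goods.
p1, p2, m = F(2), F(3), F(12)
x1, x2 = m / (2 * p1), m / (2 * p2)   # demands satisfying (25)
lam = x2 / p1                         # lambda = u1 / p1, since u1 = x2

# Matrix of the system (26): u11 = u22 = 0, u12 = u21 = 1.
H = [[0, 1, -p1],
     [1, 0, -p2],
     [-p1, -p2, 0]]
det_H = det(H)
H11 = cofactor(H, 0, 0)
H31 = cofactor(H, 2, 0)               # H_{n+1,1} with n = 2

income_derivative = -H31 / det_H           # (28): dx1/dm
substitution = lam * H11 / det_H           # (32)
income_effect = -x1 * income_derivative    # (31)
total = substitution + income_effect       # (30): dx1/dp1

print("dx1/dm       =", income_derivative)
print("substitution =", substitution)
print("income       =", income_effect)
print("dx1/dp1      =", total)
```

Under these assumptions the substitution term is negative, as (32) requires, and the total agrees with the derivative $-m/2p_1^2$ of the demand function.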

The role played by the income effect in our analysis should be noted. 
We really have no information about the cofactor, $H_{n+1,1}$, in (31), and
so we can draw no general conclusions about the nature of the income 
effect. We can only say it may sometimes be negative (and call it “the 
inferior-goods” case) and sometimes positive (calling this the case of 
“normal” goods). But, fundamentally, the income term just serves as the 
unexplored component of (30), so that only after it is removed can we make 
useful statements about the remainder. The discovery that one can say a 
great deal about the remainder after eliminating the income term is a major
contribution of Slutsky, Hicks, and Allen. 


9. Illustration IV: The Linder Theorem 


We end our display of comparative-statics problems with one which is 
simpler to follow than that of the preceding section and which to many 
will be much more interesting than the Slutsky theorem. After all, with 
all that effort, we have merely proved that under suitable restrictions the 
demand curve will have a negative slope, and one may easily wonder 
whether one might not have accepted that conclusion simply on intuitive 
grounds without all of the painstaking calculations which we have just 
gone through. Certainly, that result is not likely to be a major surprise to 
anyone. 
A comparative-statics theorem which is, perhaps, rather more
surprising has been provided by Staffan Burenstam-Linder in his fascinating
book, The Harried Leisure Class.¹³ He has challenged the conventional
view that the great problem of the twentieth century will be excessive 
leisure with which man will be unable to cope. On the contrary, the sub- 
stitution effect of rising hourly incomes always favors the purchase of 
goods whose consumption requires relatively little time (note that, once 
again, the ambiguity inherent in the income effect must first be removed 
before we can arrive at such an unqualified conclusion). 

Specifically, the theorem we will prove asserts that, because the 
consumption of commodities requires time as well as money, the substitution
effect of a rise in a person’s real wages will always work to decrease
the consumption of a commodity whose ratio of consumption time to 
price is relatively high. 

In other words, as his rate of real earnings increases, the consumer 
will be driven to purchase commodities that, although more costly in 
money terms, conserve his increasingly scarce resource: time. To derive 
this result, we use a model that divides the economy into two sectors. We 
use the following notation: 


$x_1$ = quantity purchased of the commodity under study,
$x_2$ = quantity of "all other goods" consumed,
$x_3$ = quantity of labor time spent earning income,
$p_1$, $p_2$ = prices of commodities 1 and 2, respectively,
$w$ = wage rate,
$t_1$, $t_2$ = consumption time expended per unit of commodities 1 and 2, and
$m$ = nonwage income (if any).


We will show that, neglecting income effects, $dx_1/dw > 0$ if, and only
if, $t_1/p_1 < t_2/p_2$ (that is, if commodity 1 has an unusually low ratio of
consumption time to price). The consumer’s objective is to maximize his utility:

$u(x_1, x_2, x_3)$


subject to his budget and time-availability constraints:


$p_1 x_1 + p_2 x_2 = m + w x_3$
$t_1 x_1 + t_2 x_2 + x_3 = t,$

where $t$ is the total time available to him.


13 Staffan Burenstam-Linder, The Harried Leisure Class, Columbia University Press, 
New York, 1970, pp. 150-152. 
We use the standard comparative-statics procedure to find our desired
expression for $\partial x_1/\partial w$. Our Lagrangian is


$L = u(x_1, x_2, x_3) + \lambda(m + w x_3 - p_1 x_1 - p_2 x_2) + \mu(t - t_1 x_1 - t_2 x_2 - x_3),$

with the first-order conditions


(33)  $u_1 - \lambda p_1 - \mu t_1 = 0$
      $u_2 - \lambda p_2 - \mu t_2 = 0$
      $u_3 + \lambda w - \mu = 0$
      $m + w x_3 - p_1 x_1 - p_2 x_2 = 0$
      $t - t_1 x_1 - t_2 x_2 - x_3 = 0.$


Next, we set equal to zero the total differentials of our first-order
conditions, which the reader may wish to write out as an exercise. Letting $H$
represent the determinant of the system, we have, by Cramer’s rule and
expanding the numerator determinant in terms of its first column,


(34)  $dx_1 = \begin{vmatrix}
0 & u_{12} & u_{13} & -p_1 & -t_1 \\
0 & u_{22} & u_{23} & -p_2 & -t_2 \\
-\lambda\,dw & u_{32} & u_{33} & w & -1 \\
-x_3\,dw - dm & -p_2 & w & 0 & 0 \\
0 & -t_2 & -1 & 0 & 0
\end{vmatrix} \Big/ H$

      $= -\lambda\,dw\,H_{31}/H - (x_3\,dw + dm)H_{41}/H.$


We can use precisely the same procedure as that in the previous
section to show that the last term in (34) is now our income effect. This
leaves us with the substitution effect


$-\lambda H_{31}/H = -\lambda \begin{vmatrix}
u_{12} & u_{13} & -p_1 & -t_1 \\
u_{22} & u_{23} & -p_2 & -t_2 \\
-p_2 & w & 0 & 0 \\
-t_2 & -1 & 0 & 0
\end{vmatrix} \Big/ H,$

which can be shown by expansion of this last determinant to equal

(35)  $-\lambda(p_2 + w t_2)(p_1 t_2 - p_2 t_1)/H.$
Now it can be argued that we must assume $\lambda > 0$. The argument,
incidentally, illustrates clearly the use of first-order conditions in obtaining
qualitative results in a comparative-statics analysis. Substituting from the
third equation into the first equation of first-order conditions (33) we
eliminate $\mu$ and obtain


$u_1 - \lambda p_1 - u_3 t_1 - \lambda w t_1 = 0$

or

$\lambda = (u_1 - u_3 t_1)/(p_1 + w t_1) > 0$


since presumably $u_1 > 0$, $u_3 < 0$ (marginal utility of commodity 1 is
positive; marginal utility of labor, $u_3$, is negative).

Moreover, by the second-order conditions (Proposition 2) $H < 0$.
Then since all the other terms in (35) are positive, we see that the
substitution effect will be positive if, and only if, $p_1 t_2 - p_2 t_1 > 0$, i.e.,
$t_1/p_1 < t_2/p_2$ (that is, if the ratio of time-cost to price for commodity 1
is less than that of “all other goods”). This is our desired result.
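The expansion that yields (35) is easy to confirm numerically. The following sketch uses arbitrary illustrative numbers for the prices, the wage, the time requirements, and the second derivatives of utility (all assumed, none from the text). It evaluates the fourth-order determinant directly and compares it with the product $(p_2 + wt_2)(p_1t_2 - p_2t_1)$.

```python
def det(m):
    """Determinant by expansion along the first column."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for i in range(n):
        minor = [row[1:] for k, row in enumerate(m) if k != i]
        total += (-1) ** i * m[i][0] * det(minor)
    return total


# Hypothetical numbers for illustration only.
p1, p2, w, t1, t2 = 2, 3, 4, 1, 5
u12, u13, u22, u23 = 5, -7, -4, 3   # arbitrary: they drop out of the result

minor = [[u12, u13, -p1, -t1],
         [u22, u23, -p2, -t2],
         [-p2,   w,   0,   0],
         [-t2,  -1,   0,   0]]

expanded = det(minor)
closed_form = (p2 + w * t2) * (p1 * t2 - p2 * t1)

print(expanded, closed_form)
print("t1/p1 < t2/p2:", t1 / p1 < t2 / p2)
```

Changing the assumed values of the $u_{ij}$ leaves the determinant unaltered, which is why only prices, the wage, and the time coefficients appear in (35).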

The intuitive explanation of this result rests in what is, in effect, a rising 
cost of time. Because the amount of time available to an individual is fixed, 
it becomes increasingly scarce (and hence, expensive) relative to the ex- 
panding quantities of commodities that can be purchased with an ever- 
rising income (as well as the rising wages that can be earned in each hour). 
Those consumption activities that are time intensive then become
correspondingly less attractive. To paraphrase Linder, as the individual's time
becomes more valuable he is driven to seek to spend his money “more
efficiently,” that is, more quickly.¹⁴


REFERENCES 


Silberberg, E., “A Revision of Comparative Statics Methodology," Journal of 
Economic Theory, vol. 7, No. 2, 1974 (mathematical treatment, but not too 
difficult). 


14 Our discussion has dealt only with the substitution effect. As usual, the income
effect can work either way. Where leisure is not an inferior good, the income effect will 
make for a secular rise in its demand, offsetting the substitution effect, at least in part. 
However, casual observation suggests that precisely those individuals with the educa- 
tional background and occupations associated with attendance at theatrical performances 
and the utilization of museums and libraries are the persons who have not demanded
more free time as their incomes have risen—if anything, they have tended to grow 
increasingly busy at their “responsible” jobs. 
Ordinalism was only a first step away from an analysis based on
introspective utility. Even indifference maps and subjective rankings are 
not observable directly, and they certainly do not lend themselves to 
statistical estimation. In recent years two new structures for consumer 
analysis that move us a long step toward direct observability have become 
available. The first of these, the revealed preference analysis, was designed 
almost entirely by Samuelson,¹ with the finishing touches to the analysis
contributed by Houthakker.² Samuelson also made major contributions
to the second of these innovative constructs, the use of expenditure
functions as a substitute for utility analysis, whose full formalization must,
however, be attributed to the earlier work of R. Roy.³

This chapter provides an introduction to both these approaches. The 
second of them, which is also described as the duality analysis of consumer 
behavior, has its analogous counterpart in the theory of production and 
the decision-making of the firm, where the central source is the work of 
Shephard.⁴ Some of the most suggestive applications of duality theory


1 P. A. Samuelson, Foundations of Economic Analysis, Harvard University Press, 
Cambridge, Mass., 1948, Chapter VI. 

2 H. S. Houthakker, “Revealed Preference and the Utility Function," Economica, 
Vol. XVII, May 1950. 

3 See R. Roy, De l'Utilité: Contribution à la Théorie des Choix, Paris, 1942.

4 Ronald Shephard, Cost and Profit Functions, Princeton University Press, Princeton,
N.J., 1953, 1970.

occur in production analysis, and so the discussion concludes with an
examination of the application of duality analysis to production theory, 
which should also cast some light on our discussion of expenditure functions. 


1. The Revealed Preference Model 


The revealed preference analysis undertakes to reconstruct the theory 
of the consumer on the basis of concepts which, at least in principle, do 
not require the consumer to supply any information about himself. If his 
tastes do not change, observation of his market behavior can, conceptually, 
supply all the requisite data. For this purpose we need merely record what 
combinations of commodities he buys at different prices. Given enough 
such information, it is even theoretically possible to reconstruct the 
consumer's indifference map, as we shall see. 

The entire revealed preference analysis is based on a rather simple idea. 
A consumer will decide to buy some particular set of items either because 
he likes them more than the other goods that are available to him or 
because they happen to be cheap. Suppose we observe that of two
collections of commodities offered for sale the consumer chooses to buy A rather
than B. We are, then, not entitled to conclude that he prefers A to B, 
because it is also possible that his decision just reflects the fact (if it is a 
fact) that A is the cheaper collection and he may even regret not buying 
B. But price information may be able to remove this uncertainty. If their 
price tags tell us that A is not cheaper than B, then there is only one 
plausible explanation of the consumer's choice—he bought A because he 
likes it better. More generally, we have the 


Definition: If a consumer buys some collection of goods A, rather than
the available collections B, C, D, etc., and it turns out that none of the
latter is more expensive than A, we say that A has been revealed preferred
to the others (or that the others have been revealed to be inferior to A).⁵

The complete set of combinations which are revealed inferior to A by 
one purchase can be found with the aid of the price line. In Figure 1a, 
let A represent the collection of commodities which is bought when the 


5 Let $p_a = (p_{1a}, p_{2a}, \ldots, p_{na})$ be the set of prices at which the individual buys
collection $x_a = (x_{1a}, \ldots, x_{na})$ and spurns another collection $x_b = (x_{1b}, \ldots, x_{nb})$. Then
$x_a$ is said to be revealed preferred to $x_b$ if it is at least as expensive as $x_b$ at the prices
$p_a$ at which $x_a$ is purchased. That is, $x_a$ is revealed preferred to $x_b$ if

$\sum_{i=1}^{n} p_{ia} x_{ia} \geq \sum_{i=1}^{n} p_{ia} x_{ib},$

where the right-hand sum represents the cost of collection $x_b$ at the prices $p_a$ at which
in fact $x_a$ was purchased.


price line is PP’. By definition, any other point on PP’, such as B, is just 
as expensive as A. Moreover, since every point, such as D, which is below 
and to the left of the price line, represents smaller amounts of both com- 
modities than do some points on PP’, it follows that such lower points are 
cheaper than A. Therefore, because the consumer bought A rather than 
any of these other collections that were no more expensive, it follows that 
every point on or below PP’ is revealed inferior to A. Finally, since it 
should be clear that any point above PP’ is more expensive than A, we 
see that none of these can be revealed inferior to A by the consumer’s 
purchase of A. 

We can now state the basic assumption of the theory, called the weak
assumption of revealed preference. This asserts that⁶


6 In terms of the notation of footnote 5, this premise asserts the following: Suppose
at prices $p_a$ some collection $x_a$ would be purchased, while at some other set of prices,
$p_b$, $x_b$ would be bought. Then if

$\sum p_a x_a \geq \sum p_a x_b$

so that $x_a$ is revealed preferred to $x_b$, then we can never have $x_b$ revealed preferred to
$x_a$. That is, at the prices $p_b$ at which $x_b$ is bought we must never have

$\sum p_b x_b \geq \sum p_b x_a.$
The consumer will never behave in a manner which is so inconsistent that
some collection A will be revealed preferred to B and that B will simul- 
taneously be revealed preferred to A. 


Violation of this assumption must involve the consumer’s buying A 
if it were the more expensive and then being induced by a relative rise 
in B’s price to above the price of A to switch his allegiance to B! A Cadillac 
buyer who could be induced to switch to a Chevrolet by a rise in its price 
to $50,000 would violate the weak revealed preference assumption. We 
would not normally expect consumers to behave in this apparently peculiar 
manner. However, both snob appeal and the judging of quality by price 
can, clearly, be inconsistent with this weak revealed preference assumption.⁷
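Because the weak assumption is stated entirely in terms of observable prices and purchases, it can be tested mechanically on a record of market behavior. The following sketch is one way this might be done; the function names and the two small data sets are invented for illustration. Each observation pairs a price vector with the collection bought at those prices, and a collection is treated as revealed preferred to another when it is at least as expensive at the prices at which it was bought.

```python
def cost(prices, bundle):
    """Money cost of a bundle at the given prices."""
    return sum(p * x for p, x in zip(prices, bundle))


def revealed_preferred(prices_a, bundle_a, bundle_b):
    """True if bundle_a, bought at prices_a, is revealed preferred to bundle_b."""
    return (bundle_a != bundle_b
            and cost(prices_a, bundle_a) >= cost(prices_a, bundle_b))


def violates_weak_assumption(observations):
    """Check every pair of (prices, bundle) observations for a two-way reversal."""
    for i, (pa, xa) in enumerate(observations):
        for pb, xb in observations[i + 1:]:
            if revealed_preferred(pa, xa, xb) and revealed_preferred(pb, xb, xa):
                return True
    return False


# Hypothetical observations: (prices, bundle bought at those prices).
consistent = [((1, 2), (4, 1)), ((2, 1), (2, 2))]
inconsistent = [((1, 2), (4, 1)), ((1, 3), (2, 2))]

print(violates_weak_assumption(consistent))    # prints: False
print(violates_weak_assumption(inconsistent))  # prints: True
```

In the second data set each collection is bought when the other is no more expensive, which is exactly the pattern the assumption rules out.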

We shall also employ a second assumption: 


Given any collection of goods, the consumer can be induced to buy it if its 
price is made sufficiently attractive, i.e., for any point in Figure 1a, there 
exists some price line involving positive prices and a positive income level 
which will lead the consumer to buy it. 


These two assumptions are all that is needed to derive any of the standard
results of the theory of consumer behavior (with the exception of in- 
tegrability, as noted in an earlier footnote). 

As an illustration let us see how revealed preference theory can be used 
to prove the Slutsky theorem, which states that if the income effect is 
ignored, the demand curve must have a negative slope. Although a two- 
dimensional diagram is employed for expository purposes, every step 
of the argument carries over to a situation involving any number of 
commodities. 

In Figure 1a let A represent the combination of commodities bought 
when the price line is PP’. We want to show, once again, that a fall in the 
price of commodity Z from PP’ will increase (not decrease) purchases of Z 


7 The strong assumption of revealed preference is a sort of transitivity extension of
the weak assumption. It asserts that if collection $x_a$ is revealed preferred to $x_b$, if $x_b$ is
in turn revealed preferred to $x_c, \ldots,$ and $x_y$ is revealed preferred to $x_z$, then no set of
prices will reveal $x_z$ preferred to $x_a$. Houthakker showed that this premise is required
to guarantee the existence of a utility function that is consistent with any given
indifference map, the so-called problem of integrability. Given any ordinary indifference
map in two dimensions it is always possible to construct many utility surfaces consistent
with these preferences, but the same is not necessarily true in an n-variable case.
However, it is true if the consumer’s preferences satisfy the strong axiom of revealed
preference. (The term integrability occurs here because indifference curves can be
described by the differential equations, $du = 0$, where $u$ is the consumer's utility, and
integration of these equations, if it is possible, yields the equation of the corresponding
utility surfaces.)




if we consider only the substitution effect. For this purpose, we insert the 
imaginary price line RR’ which passes through point A, so that the con- 
sumer’s real income remains constant in the sense that he can still just 
make his original purchase at the new prices, if he wishes to do so. RR’ is 
flatter than PP’ because Z has, by hypothesis, fallen in price. We want to 
prove that the new equilibrium point on RR’ (if it is different from A) 
must be a point like E, which lies to the right of A (an increased demand 
for Z). To prove that this must be so, we show that any point on RR’, 
such as D, which lies to the left of A, is ruled out by the weak revealed 
preference assumption. We know that, since D lies below PP’, A is revealed 
to be preferred to D. But if D were chosen when the price line was RR’, 
then since A is no more expensive than D at those prices (they lie on the 
same price line), D would be revealed preferred to A. Hence A would be 
revealed preferred to D and vice versa, which is precisely what the weak 
revealed preference assumption prohibits. Thus, no point on RR’ which, 
like D, lies to the left of A can be chosen. The substitution effect of a fall 
in the price of Z will generally increase the demand for Z (or at least not 
decrease it), as was to be proved. 


2. Revealed Preference and the Slutsky Theorem in n Variables 


The nature of revealed preference analysis in n commodity problems is 
easily illustrated by an explicit derivation of the Slutsky theorem in n 
variables. 

Suppose our consumer would purchase quantities $x_1, \ldots, x_n$ of these
commodities at prices $p_1, \ldots, p_n$. However, if the price of good 1 were
replaced by $p_1 + \Delta p_1$, all other prices held constant, and the consumer’s
income were changed so as to leave him on the same indifference curve (i.e.,
the income effect were removed), his purchases would change to some
quantities $x_1 + \Delta x_1, \ldots, x_n + \Delta x_n$, and the increments $\Delta x_1, \ldots, \Delta x_n$ must
represent the substitution effects of the price change, $\Delta p_1$. The Slutsky
theorem asserts that the sign of $\Delta x_1$ will be the opposite of that of $\Delta p_1$,
that is, if the commodity’s price rises, the substitution effect is a decrease
in purchase of that item and vice versa.

Now $x_1, \ldots, x_n$ and $x_1 + \Delta x_1, \ldots, x_n + \Delta x_n$ are indifferent because
the income effect has been removed. Thus, the former cannot be revealed
preferred to the other; at the prices $p_1, \ldots, p_n$ at which the former is
purchased it cannot be as expensive or more expensive than the latter.
That is,


$p_1 x_1 + p_2 x_2 + \cdots + p_n x_n < p_1(x_1 + \Delta x_1) + p_2(x_2 + \Delta x_2) + \cdots + p_n(x_n + \Delta x_n).$
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Similarly, because they are indifferent, the second collection of goods 
z; + Az; cannot be revealed preferred to the first so that at the prices 
(pı + Api, P2 ` `` , Pn) at which that collection is bought 


(pi + Api)z1 + pate +++ + pata > (pı + Api) (21 + Azi) 
+ pa(za + Are) 
+++++ pans + Atn) 


Subtracting the first of these inequalities from the second we obtain at 
once 
$\Delta p_1 x_1 > \Delta p_1 (x_1 + \Delta x_1)$ 
or 
$0 > \Delta p_1 \Delta x_1$, 


which is our Slutsky theorem. 
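
The two inequalities and their difference can be checked numerically. The sketch below is only an illustration under an assumed utility function, $u = \sqrt{x_1 x_2}$, which does not appear in the text; `compensated_demand` is a hypothetical helper giving the least-cost bundle on a fixed indifference curve for that particular function.

```python
import math


def compensated_demand(p1, p2, u):
    """Least-cost bundle reaching utility u when u(x1, x2) = sqrt(x1 * x2).

    This closed form is an assumption made for the illustration only.
    """
    return u * math.sqrt(p2 / p1), u * math.sqrt(p1 / p2)


def cost(prices, bundle):
    """Money outlay needed to buy `bundle` at `prices`."""
    return sum(p * x for p, x in zip(prices, bundle))


p_old = (2.0, 1.0)                    # initial prices
dp1 = 0.5                             # rise in the price of good 1
p_new = (p_old[0] + dp1, p_old[1])
u_star = 10.0                         # utility level held fixed

x_old = compensated_demand(p_old[0], p_old[1], u_star)
x_new = compensated_demand(p_new[0], p_new[1], u_star)
dx1 = x_new[0] - x_old[0]             # substitution effect on good 1

# First inequality: at the old prices the old bundle is the cheaper one.
first_holds = cost(p_old, x_old) < cost(p_old, x_new)
# Second inequality: at the new prices the new bundle is the cheaper one.
second_holds = cost(p_new, x_old) > cost(p_new, x_new)
# Their difference: the price change and the quantity change have opposite signs.
slutsky_product = dp1 * dx1

print(first_holds, second_holds, slutsky_product < 0)
```

Any other well-behaved utility function could be substituted for the assumed one; only the closed form inside `compensated_demand` would change.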
PROBLEM 


1. Provide a different derivation of the Slutsky theorem from the revealed 
preference assumption using the alternative definition of removal of the 
income effect under which the consumer’s money income is adjusted so that 
after the price change he can just purchase the initial collection of goods. 
That is, 


$(p_1 + \Delta p_1)x_1 + p_2 x_2 + \cdots + p_n x_n = (p_1 + \Delta p_1)(x_1 + \Delta x_1) + p_2(x_2 + \Delta x_2) + \cdots + p_n(x_n + \Delta x_n)$. 


3. Revealed Preference and the Indifference Map 


The revealed preference assumptions also permit us, in principle, to 
construct the consumer’s indifference map on the basis of enough observa- 
tions on his market behavior. Going back to Figure 1a, suppose this time 
that B is observed to be the combination which is chosen by the consumer 
when the price line is PP’, and let us try to find the indifference curve 
through point B. We already know from our first observation that B is 
revealed preferred to every point on or below PP’. Moreover, it is easily 
shown that every point such as M, which lies in the region above and to 
the right of point B (the shaded region above KBL), is revealed preferred 
to B. It is, of course, highly plausible that M is preferred to B, for M 
contains more of one or both commodities than does B (it is above and to the 
right of B).8 


8 M is revealed preferred to B by the fact that it is always at least as expensive as 
B since it contains more of at least one commodity (and no less of either good). And 

It follows that the remainder of the indifference curve through B must 
lie below area KBL and above price line PP’, i.e., that it must lie some- 
where in the unshaded region above line PP’. This proves at once that, at 
least near B, the indifference curve must have a negative slope (otherwise 
it would enter area KBL) and that it must be convex to the origin (it 
must be above PP’ both to the right and to the left of B). Since this argu- 
ment can be repeated for any other point in the diagram, we see how the 
revealed preference theory can be used to prove that all indifference curves 
must be of negative slope and convex to the origin throughout their length. 

However, we still have quite a way to go before we find the precise 
shape of the indifference curve through B, since all we have seen so far is 
that it can lie anywhere in the unshaded region above line PP’ (which has 
been called the zone of ignorance). But further observations of the con- 
sumer’s behavior can, as will be shown now, permit us to extend the shaded 
regions by chipping away at the zone of ignorance, and thus to get closer 
and closer to finding the precise location of the indifference curve through B. 

First let us see how we can extend the region OPP’, which is revealed 
inferior to B. Consider any point other than B on PP’, e.g., point A 
(Figure 1a), which has, therefore, been revealed inferior to B. By the second 
assumption of revealed preference theory, there is some price line, RR’, 
which will lead the consumer to purchase A. We find RR’ by watching the 
consumer and recording his income and the prices he pays when we see him 
buy A. Any point on or below RR’ is now revealed inferior to A, and since 
A has, in turn, been revealed inferior to B, everything on or below RR’ is 
revealed inferior to B.9 Thus triangle AP’R’ is revealed inferior to B: it 
has been chopped off from the region of ignorance. We can repeat this pro- 
cedure as many times as we wish. For example, we can take any other 
point, such as F, on PP’ (Figure 1b), find its price line UU’, and thereby 
show that triangle PFU is revealed inferior to B and thus remove this 
triangle from our zone of ignorance. Or we can take a point on one of the 
added price lines, such as point E on RR’, and observe the price line TT’ 
at which E is bought. Since every point on or below TT’ is revealed to be 
inferior to E, and E is inferior to A, which is, in turn, inferior to B, all of 
these points are revealed inferior to B. Hence, triangle R'ET' is now 
removed from the zone of ignorance, etc. In this way we can go on chopping 
away at the underbelly of the zone of ignorance indefinitely, getting closer 
and closer to the indifference curve through point B which we seek. 
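
The lower chopping-away procedure can be sketched in code. The example below is a hypothetical illustration: the bundles, prices, and the helper names `revealed_inferior` and `classify` are assumptions, not part of the text. It treats each observation as a bundle together with the prices at which it was bought, and it handles only the region above and to the right of the target bundle on the preferred side (the offer-curve refinement is not attempted).

```python
def cost(prices, bundle):
    """Money outlay needed to buy `bundle` at `prices`."""
    return sum(p * q for p, q in zip(prices, bundle))


def revealed_inferior(point, target, observations, tol=1e-9):
    """True if `point` is revealed inferior to `target`, either directly or
    through a chain of observed choices (the transitivity step).

    `observations` maps each bundle the consumer was seen to buy to the
    prices at which he bought it; his income is taken to equal his outlay.
    """
    frontier, seen = [target], {target}
    while frontier:
        chosen = frontier.pop()
        prices = observations[chosen]
        budget = cost(prices, chosen)
        # Anything on or below the price line of `chosen` is inferior to it.
        if point != chosen and cost(prices, point) <= budget + tol:
            return True
        # Follow the chain to other observed bundles on or below that line.
        for other in observations:
            if other not in seen and cost(prices, other) <= budget + tol:
                seen.add(other)
                frontier.append(other)
    return False


def classify(point, target, observations):
    """Place `point` relative to the indifference curve through `target`."""
    if point != target and all(q >= t for q, t in zip(point, target)):
        return "revealed preferred"      # the region above and to the right
    if revealed_inferior(point, target, observations):
        return "revealed inferior"
    return "zone of ignorance"


# Hypothetical data: B is bought on price line PP', A (a point on PP') on RR'.
B, A = (4.0, 4.0), (6.0, 2.0)
observations = {B: (1.0, 1.0), A: (1.0, 2.0)}

print(classify((2.0, 3.0), B, observations))   # a point below PP'
print(classify((9.0, 0.4), B, observations))   # above PP' but below RR'
print(classify((5.0, 5.0), B, observations))   # more of both goods than B
print(classify((3.0, 6.0), B, observations))   # not reached by either observation
```

Adding further observations to the dictionary shrinks the set of points that fall through to the last branch, which is the chipping away described above.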


since, by the second assumption of revealed preference theory, some positive prices can 
induce the consumer to buy the more expensive collection, M, those prices must reveal 
that he prefers M to B. 

9 Note that this argument sneaks in an assumption of transitivity. If E is revealed 
inferior to A and A is revealed inferior to B, we assume that E is thereby revealed 
inferior to B. 
Moreover, the upper portion of the zone of ignorance can also be 
hacked away bit by bit. Thus, in Figure 1c draw any new price line, VV’, 
through B. We observe the consumer when prices and his income happen 
to correspond to budget line VV’. Let W be the point which is chosen with 
these prices and income. At these prices B is no more expensive than W, 
so that W (and, consequently, all of the region above and to the right of 
W) is revealed to be preferred to B. This procedure can be repeated with 
other price lines through point B, each of which yields a point like W 
that is revealed to be preferred to B. The locus of all such points, the 
curve XX’ in Figure 1d, and all points above and to the right of XX’ are, 
then, revealed preferred to B. XX’ is called the offer curve through point 
B.10 We can chop away still more of the zone of ignorance by choosing 
any point W on offer curve XX’, observing what the consumer buys with 
various price lines through W, and so constructing the offer curve YY’ 
through W. Since any point on or above YY’ is revealed preferred to W, 
which is in turn preferred to B, these points are all shown to be preferred 
to B. Proceeding as long as we wish in this way, we can narrow down the 
region of possible location of the indifference curve through B (the zone 
of ignorance) as far as we like. 

Unfortunately, the proof that the upper and lower chopping-away 
sequences converge, and so exactly narrow the zone of ignorance down to a 
single indifference curve, is rather difficult and involves more advanced 
theorems in differential equations.11 However, the basic idea of the revealed 
preference approach to indifference curve construction should be clear 
from the foregoing discussion. 


4. Revealed Preference and Index Numbers of Real Income 


An index number formula for the measurement of real income under- 
takes to employ price and quantity information for each of two periods 
and to determine on the basis of these data alone whether real income has 
risen, fallen, or remained unchanged. One of the major difficulties in the 
construction of an index number formula lies in the problem of evaluating a 
real income change which involves many individuals, since it may be an 
improvement from the point of view of some people but an unfortunate 
development in the opinion of some others. But even though we will deal 
with only one person in order to evade this problem, we will see that the 
construction of an index number formula still runs into fundamental 
difficulties. 


10 That is because XX’ shows the various commodity combinations, such as W, 
which the consumer will offer to buy at different relative prices (different price lines) 
any of which enable the consumer to buy combination B with no money left over. 


11 See H. S. Houthakker, op. cit. 
Suppose that the consumer receives some collection of goods, B, in 
one period and some other collection, Q, in the next. In some ultimate 
psychological sense we can say that his real income will have risen if and 
only if Q lies above his indifference curve through B; his real income is 
unchanged if and only if B and Q are on the same indifference curve, and 
his real income will have fallen if point Q lies below the indifference curve 
which passes through point B. Thus, to accomplish its purpose the index 
number formula must somehow be able to indicate on the basis of two sets 
of price-quantity observations where the indifference curve through B 
lies in relation to point Q. 

But we have just seen that such a small amount of price and quantity 
information must leave us with a considerable zone of ignorance as to the 
location of any indifference curve. Hence it is impossible to design any 
index number formula which always tells us whether real income has risen, 
fallen, or remained unchanged. In some cases, depending on the prices or 
quantities involved, it is possible to determine whether B or Q represents 
the larger income. For example, in Figure 1a if PP' represents the price 
situation when B is purchased and then if Q lies below PP', we know that 
B is revealed preferred to Q so that the change from B to Q represents a 
fall in real income. Similarly, if Q lies in region KBL, we know that real 
income must have risen. But if Q lies in the unshaded zone of ignorance, we 
may well lack information sufficient to determine what has happened to 
real income, and no formula can supply these missing data. 

There are two cases in which we can be sure of what has happened to 
real income: If Q lies below B's price line, real income must have fallen, 
whereas if B lies below the price line when Q is purchased, so that Q is 
revealed preferred to B, then real income has risen. That is as far as the 
data will carry us; if neither of these situations happens to hold, no index 
number formula can determine what has really happened to real income. 
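
The two decidable cases amount to a pair of cost comparisons. The following sketch uses hypothetical prices and bundles, and the function name `real_income_change` is an assumption introduced here; it reports "undetermined" whenever neither comparison settles the matter.

```python
def cost(prices, bundle):
    """Money outlay needed to buy `bundle` at `prices`."""
    return sum(p * q for p, q in zip(prices, bundle))


def real_income_change(p_b, b, p_q, q):
    """Compare bundle `b` (bought at prices `p_b`) with `q` (bought at `p_q`).

    Uses only the two price-quantity observations, as the text requires.
    """
    if tuple(b) == tuple(q):
        return "unchanged"
    q_within_b_line = cost(p_b, q) <= cost(p_b, b)   # B revealed preferred to Q
    b_within_q_line = cost(p_q, b) <= cost(p_q, q)   # Q revealed preferred to B
    if q_within_b_line and b_within_q_line:
        # Each bundle revealed preferred to the other: the weak revealed
        # preference assumption is violated by these data.
        return "inconsistent"
    if q_within_b_line:
        return "fallen"
    if b_within_q_line:
        return "risen"
    return "undetermined"                            # Q is in the zone of ignorance


# Hypothetical observations; B = (4, 4) is bought at prices (1, 1).
fallen = real_income_change((1.0, 1.0), (4.0, 4.0), (1.0, 2.0), (3.0, 3.0))
risen = real_income_change((1.0, 1.0), (4.0, 4.0), (1.0, 1.0), (6.0, 5.0))
unknown = real_income_change((1.0, 1.0), (4.0, 4.0), (1.0, 3.0), (7.0, 2.0))
print(fallen, risen, unknown)
```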

Yet any one of the standard index number formulas is set up as a test 
of the direction of change of real income. The price and quantity data for 
points B and Q are inserted into the formula, and if the resulting index 
number turns out to be greater than 100, real income has allegedly risen; 
if it is equal to 100, it is supposed to be unchanged; and so on. It is natural 
to ask about the basis on which these judgments are made when the re- 
quired indifference curve information is not available. The answer is that 
any index number formula implicitly sets up an imaginary and arbitrary 
indifference map and then treats it as though it were the consumer’s true 
indifference map, using this arbitrary map to determine what has happened 
to his real income. Of course, if the individual’s true indifference map 
differs from the artificial map implicit in the index number formula, the 
index number may well imply that real income has gone up when it has, 
in fact, decreased, and vice versa. 


352 Towards Observability Chapter 14 


This can be illustrated by a brief analysis of that index number of real 
income which uses base-period prices as weights (the Laspeyres index). Let 
us, for simplicity, suppose there are only our two commodities, C and Z, 
and let $p_{bc}$ and $p_{bz}$ be their respective base-period prices. If the quantities 
held by the consumer in the base period were $c_b$ and $z_b$ and if $c$ and $z$ are 
his current possessions of the commodities, the expression for the Laspeyres 
index of current real income is 


$100 \, \dfrac{p_{bc} c + p_{bz} z}{p_{bc} c_b + p_{bz} z_b}$, 


i.e., the value of current purchases $c$ and $z$ at base-year prices ($p_{bc} c + p_{bz} z$) 
divided by the actual base-year expenditure on the two commodities 
($p_{bc} c_b + p_{bz} z_b$), all multiplied by 100. Suppose, then, that we know the 
four base-year numbers $p_{bc}$, $p_{bz}$, $c_b$, and $z_b$ and that we want to find which 
possible combinations of $c$ and $z$ will, according to this expression, leave 
our consumer’s real income unchanged. Income will remain constant on 
this Laspeyres index calculation whenever we have 


(1) $100 \, \dfrac{p_{bc} c + p_{bz} z}{p_{bc} c_b + p_{bz} z_b} = 100$, 

where the reader should remember that the four base-year magnitudes 
$p_{bc}$, $p_{bz}$, $c_b$, and $z_b$ are given, fixed numbers, not variables. Now divide both 
sides of the equation by 100 to cancel it out, and use $m_b$ to designate the 
given (constant) total base-year expenditure on both commodities together, 
the denominator in (1). The indifference curve equation (1) then becomes 


$\dfrac{p_{bc} c + p_{bz} z}{m_b} = 1$, 

that is, 


(2) $p_{bc} c + p_{bz} z = m_b$. 


Equation (2) is just another version of the formula (1) for any combination 
of goods C and Z which the Laspeyres index considers indifferent with that 
of the base year (real income unchanged). It is the equation of a Laspeyres 
index indifference curve. But the reader will note that (2) is the equation of 
the price line with base-period prices $p_{bc}$ and $p_{bz}$ and base-period income 
(expenditure) $m_b$. In other words, the Laspeyres indifference curve is the 
base-period price line! Thus it is the very lowest edge (PP’) of the zone of 
ignorance (Figure 1a); any points below it are necessarily revealed 
inferior to the base-period point. 
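
A small numerical check makes the point concrete. The base-period prices and quantities below are hypothetical, and `laspeyres_index` is simply the expression for the index written as a function: every bundle on the base-period price line scores exactly 100, bundles below it score less, and bundles above it score more.

```python
def laspeyres_index(p_bc, p_bz, c_b, z_b, c, z):
    """Laspeyres index of current real income, using base-period prices as weights."""
    return 100.0 * (p_bc * c + p_bz * z) / (p_bc * c_b + p_bz * z_b)


# Hypothetical base period: prices (2, 1), quantities (3, 4), so m_b = 10.
p_bc, p_bz, c_b, z_b = 2.0, 1.0, 3.0, 4.0

# Bundles satisfying equation (2): p_bc * c + p_bz * z = m_b.
on_price_line = [(5.0, 0.0), (1.0, 8.0), (2.5, 5.0), (3.0, 4.0)]
indices_on_line = [laspeyres_index(p_bc, p_bz, c_b, z_b, c, z) for c, z in on_price_line]

index_below = laspeyres_index(p_bc, p_bz, c_b, z_b, 2.0, 4.0)   # costs 8 at base prices
index_above = laspeyres_index(p_bc, p_bz, c_b, z_b, 4.0, 4.0)   # costs 12 at base prices

print(indices_on_line, index_below, index_above)
```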
This result helps us to evaluate the Laspeyres index number, for we 
know that the true indifference curve through point B (which represents 
the combination of C and Z consumed during the base period) must lie 
above this price line PP’ (Figure 2). We see that if the point representing 
current consumption is located below PP’ (point Q) so that the Laspeyres 
index number says real income has fallen, this must actually be the case. 
But if the Laspeyres index number indicates that real income has risen, it 
may (point Q”) or may not (point Q’) in fact have done so (for point Q” 
also lies above the true indifference curve but point Q’ lies below it). In 
sum, the Laspeyres indifference curve may charitably be considered the 
lowest possible curve in the zone of ignorance, and whenever it is wrong it 
must overvalue current real income. In other words, it is the most sanguine 
of all admissible indices of real income, since any index which is more 
biased in this direction (e.g., if it says Q is also better than B) must imply 
that the consumer's base-period indifference curve actually cuts below the 
base-period price line, into the region which is revealed inferior to B! 

Although it is possible to conduct a similar analysis of the concealed 
implications of any other index number formula, such an investigation is 
usually somewhat more difficult than that of the Laspeyres case. 


Figure 2 (curve label: unknown true indifference curve) 


5. Duality and the Theory of the Producer and Consumer 


We turn now to the second of the major reformulations of consumer 
analysis: duality or expenditure-function theory. 
The last few years have witnessed a substantial flow of writings on this 
subject, which (like revealed preference theory) is designed in some sense 
to constitute a reformulation containing [...] own and offers many analytic 
advantages [...] immediate empirical application. 


The newer analysis represents a dual approach to the more conventional 
analyses both literally and in spirit. It may be recalled from the discussion 
of Chapter 6 that when production decisions were interpreted as “the 
primal problem” the analysis was conducted in terms of activity (physical 
output) levels, whereas the dual problem instead emphasized money 
values. In an economic model in which physical quantities (e.g., consumer 
demands or outputs) can be taken to be determined by prices or prices by 
physical quantities one has a choice over which of these two types of 
variable to use in formulating the analysis. At least in principle, the 
business firm can choose how much it hopes to sell (and from this it can 
deduce a selling price), or it can instead decide how much it wants to 
charge for its product (and then estimate how much it will be able to sell 
at each price). In exactly the same way, the theorist can interchange the 
role he assigns in his analysis to the "real" and the pecuniary variables. 
This is precisely the changeover that occurs as one proceeds from the 
conventional formulation described in earlier chapters of the book to the 
dual analysis. 

As we will see, one of the immediate fruits of this changeover is the 
replacement of relatively abstract concepts such as utility and production 
functions, which are not only difficult to deal with statistically but are 
even more difficult to explain to practitioners. Their place is taken by 
expenditure functions (indicating how a consumer’s outlays vary with the 
amounts he purchases), by cost functions, and by profit functions, all of 
which are easily understood, conceptually, and all of which are expressed 
in terms of variables and parameters for which empirical data are more 
readily available. 

It should be emphasized that the equivalence in principle between 
duality theory and the conventional analysis is no shortcoming of the 
former. On the contrary it is a substantial accomplishment to show that a 
relatively straightforward concept such as a consumer’s expenditure func- 
tion contains within it all the information obtainable from his (inaccessible) 
utility function and that under specified circumstances the properties of 
the utility relationship can be deduced from the expenditure function. 

The dual approach then offers us at least three principal advantages: 


1. It enables us to formulate many problems in a way that is 
“natural,” i.e., translatable into intuitive ways of looking at the 
analyses. 

2. It is often more readily adaptable to empirical estimation. 

3. It facilitates the processes of theoretical deduction and proof. 


In particular, as we will see, one important case of the last of these 
advantages lies in the field of comparative statics where it permits the 
elimination of many long and tedious calculations, substituting for them 
calculations which are basically simple, even though the unfamiliarity of 
their approach may make them seem somewhat more difficult than they 
actually are. 

As often happens in a relatively new branch of analysis, the duality 
literature does not generally make easy reading for the uninitiated. Some 
of the materials are, indeed, inherently complicated, but there is also a 
great deal, encompassing some of the most fundamental parts of the 
analysis, that can be understood without too much trouble. These are of 
course the portions of the discussion upon which this chapter concentrates. 


6. The Expenditure Function as a Substitute for Utility Analysis 


The basic structure of the conventional theory of the consumer, it will 
be recalled, consists of a simple constrained maximization model. The 
consumer is assumed to want to select that combination of purchases that 
maximizes his utility subject to his budget constraint, i.e., he is taken to 


$\max \; u(x_1, \ldots, x_n)$ 
subject to 


$\sum p_i x_i \le m$. 


The dual of this problem, whose interpretation is obvious, describes the 
consumer as a minimizer of expenditure, $\sum p_i x_i$, for whatever level of 
utility he attains. He is taken to seek to 


(3) $\min \; E = \sum p_i x_i$ 
subject to 
(4) $u(x_1, \ldots, x_n) \ge u^*$, 


where $u^*$ is some given level of utility. 

We may now define the expenditure function, $E(p_1, \ldots, p_n, u^*)$, as the 
minimum level of spending necessary to achieve the given level of utility 
$u^*$ when prices are set at $p = p_1, \ldots, p_n$, i.e., it is the value of $E$ obtained 
from the solution of problem (3), (4). More explicitly 


(5) $E(p, u^*) = E(p_1, \ldots, p_n, u^*) = \sum p_i x_i^0$, 


where the $x_i^0$ are the optimal values of the $x_i$ obtained from the solution 
to the dual problem (3), (4) and $p$ represents the vector of parameter 
values, $p_1, \ldots, p_n$. The expenditure function, which is expressed entirely 
in terms of observable prices and outputs, involves none of the abstract 
and disputable issues that arise in an analysis basing itself on the utility 
function. Yet one remarkable feature of the theory is that a “well-behaved” 
expenditure function, i.e., one with certain desirable properties which will 
soon be spelled out, contains within it an indirect utility function which is 
also certain to be well behaved. That is, not only can one deduce the 
expenditure function from the utility function via (3), (4), but one can 
proceed in the reverse route, going from expenditures to utilities, and in 
each case, desirable properties of the function with which one begins the 
calculation imply analogous desirable properties for the function which has 
been deduced. 
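
As a worked illustration of the route from a utility function to its expenditure function, consider a two-good case with a Cobb-Douglas utility function. This functional form and the exponent $a$ are assumptions chosen for the example; they are not taken from the text. Solving problem (3), (4) for this case gives:

```latex
% Hypothetical illustration: Cobb-Douglas utility, assumed for this example only.
\begin{align*}
u(x_1, x_2) &= x_1^{a} x_2^{1-a}, \qquad 0 < a < 1, \\
\text{first-order conditions:}\quad \frac{p_1 x_1}{a} &= \frac{p_2 x_2}{1-a}, \\
x_1^0 &= u^* \left[ \frac{a\, p_2}{(1-a)\, p_1} \right]^{1-a}, \qquad
x_2^0 = u^* \left[ \frac{(1-a)\, p_1}{a\, p_2} \right]^{a}, \\
E(p, u^*) &= p_1 x_1^0 + p_2 x_2^0
           = u^* \left( \frac{p_1}{a} \right)^{a} \left( \frac{p_2}{1-a} \right)^{1-a}.
\end{align*}
```

In this special case the expenditure function is proportional to $u^*$, so the reverse route from expenditures back to utilities is a matter of dividing the outlay by the price term.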

These and other significant properties of the expenditure function are 
contained in a number of theorems to which we turn next. 


7. Three Properties of the Expenditure-Utility Relationships12 


A utility function for our purposes may be considered to be well 
behaved if it possesses three basic properties: 


1. There are some values of $x_1, \ldots, x_n$ for which the consumer 
is not sated13 (the quantities with which our discussion deals will all 
be taken from the region of nonsatiation). 

2. The utility function is strictly quasi-concave (Chapter 9, 
Section 7), meaning, in effect, that in the zone of nonsatiety its 
indifference curves are “convex to the origin.” 

3. The utility function is continuous in the variables $x_1, \ldots, x_n$. 


These properties are not all required in order to arrive at all the 
propositions that follow, but they are sufficient for their derivation. To 
avoid complications we will therefore assume that they hold for any utility 
function from which we wish to deduce an expenditure function. For many 
purposes it is also convenient to assume a further property: 


4. The utility function has first and second partial and cross-partial 
derivatives in all the variables $x_1, \ldots, x_n$, i.e., the derivatives 
$\partial u/\partial x_i$, $\partial^2 u/\partial x_i^2$, and $\partial^2 u/\partial x_i \partial x_j$ exist for all relevant commodities $i$, $j$. 


12 The materials in the remainder of this chapter are relatively advanced. 

13 It will be recalled, from Section 7 of Chapter 9, that nonsatiation means that the 
set of $x$ values lies in a region such that consumers prefer larger quantities of any and 
all of the commodities in question and that, consequently, indifference curves for any 
pair of the n commodities must have negative slopes. 
Obviously, if property 4 holds, then property 3 becomes redundant. 
We may note that the properties we have just discussed are essentially 
those assumed in conventional utility and indifference analysis as described 
in Chapter 9, and they permit us to deduce all the standard results of that 
analysis. That is why we may refer to a utility function that has these 
properties as being “well behaved.” With these conditions satisfied the 
dual problem (3), (4) will possess a solution, i.e., the expenditure function 
then exists. 

We now deduce immediately 


Proposition 1: The expenditure function is a linearly homogeneous 
function of prices. That is, if all prices are increased k-fold, then the 
minimum expenditure necessary to attain the utility level $u^*$ is also 
increased k-fold. 


The proof of Proposition 1 is completely trivial. Suppose all prices are 
multiplied by $k$. Referring to (3) and (4) this will obviously not change the 
optimal consumption quantities, $x_i^0$, for whatever $x$'s minimize $\sum p_i x_i$ 
must also minimize the value of the objective function after the price 
change, $\sum k p_i x_i = k \sum p_i x_i$, when both are subject to the same constraints, 
since the price change merely multiplies the objective function by the 
constant, $k$. Thus, by (5) the new expenditure function becomes 


(6) $E(kp, u^*) = \sum k p_i x_i^0 = kE(p, u^*)$, 


which is what we were to prove, i.e., that multiplication of each price by 
k also multiplies the expenditure function by k. 


Proposition 2: The expenditure function is strictly monotonically in- 
creasing with utility level u*. That is, higher utility levels can be attained 
by the consumer only if he increases his expenditure. 


Proof: Because the consumer is not sated and prices are positive, if $u^*$ 
is to be decreased slightly to $u^* - \Delta$, one way in which the consumer can 
certainly save money is by a small proportionate decrease in his consumption 
of all commodities sufficient to get him down to the lower utility level. 
Since he will be able to save at least this amount at the minimum 
level of expenditure, it follows that we must have $E(p, u^* - \Delta) < E(p, u^*)$. 
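
Propositions 1 and 2 can be checked by brute force. The sketch below is a rough illustration, assuming the utility function $u = \sqrt{x_1 x_2}$ (not one used in the text) and a hypothetical helper `expenditure` that searches a grid of bundles lying on the indifference curve $u = u^*$. Restricting the search to that curve relies on nonsatiation: a bundle yielding more than $u^*$ could always be made cheaper.

```python
def x2_on_curve(x1, u_star):
    """Quantity of good 2 that keeps sqrt(x1 * x2) equal to u_star (assumed utility)."""
    return u_star ** 2 / x1


def expenditure(prices, u_star, grid):
    """Approximate E(p, u*): the cheapest grid bundle on the u* indifference curve."""
    p1, p2 = prices
    return min(p1 * x1 + p2 * x2_on_curve(x1, u_star) for x1 in grid)


grid = [0.01 * i for i in range(1, 10001)]   # candidate x1 values, 0.01 to 100
p = (2.0, 3.0)                               # hypothetical prices
k = 2.5                                      # common factor applied to all prices

e_base = expenditure(p, 4.0, grid)
e_scaled = expenditure((k * p[0], k * p[1]), 4.0, grid)   # Proposition 1
e_lower_u = expenditure(p, 3.9, grid)                     # Proposition 2

print(e_scaled, k * e_base)   # equal up to rounding: linear homogeneity
print(e_lower_u < e_base)     # a lower utility level costs less to attain
```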


We now come to a result which is far less obvious and only slightly 
more difficult to prove: 


Proposition 3: If the utility function satisfies properties 1-4, then the 
expenditure function is concave in prices. 
It will be recalled (Chapter 9, Section 16) that a concave function is 
defined as one which, intuitively speaking, is “concave downward.” 
Mathematically, 


Definition: the function $y = f(p_1, \ldots, p_n) = f(p)$ is concave if given 
the line segment connecting any two points on its graph, $p' = (p_1', \ldots, p_n')$ 
and $p'' = (p_1'', \ldots, p_n'')$, then for any $p^* = wp' + (1 - w)p''$ which is an 
interior point on the line segment connecting $p'$ and $p''$ the “height” above 
$p^*$ of the line segment connecting $f(p')$ and $f(p'')$, i.e., $wf(p') + (1 - w)f(p'')$, 
will be no greater than that of the corresponding point on the graph of the 
function, $f(p^*)$. That is, $f(p)$ is concave if, for any constant $w$ such that 
$0 < w < 1$, 


(7) $f(p^*) = f[wp' + (1 - w)p''] \ge wf(p') + (1 - w)f(p'')$. 


We now offer the proof of Proposition 3, leaving until afterward the 
discussion of its economic content. 

Let $x' = (x_1', \ldots, x_n')$, $x'' = (x_1'', \ldots, x_n'')$, and $x^* = (x_1^*, \ldots, x_n^*)$ 
represent the optimal (least-cost) solutions under the three sets of prices 
$p_i'$, $p_i''$, and $p_i^*$, respectively, where $p_i^* = wp_i' + (1 - w)p_i''$; that is, the $p_i^*$ 
are a weighted average of $p_i'$ and $p_i''$, so that $p^*$ is any point on the line 
segment connecting $p'$ and $p''$ (see Section 15 of Chapter 9). By (4), $x'$, $x''$, 
and $x^*$ each yield at least the utility level $u^*$. Hence at prices $p_i'$ the bundle 
$(x_1^*, \ldots, x_n^*)$ must be at least as expensive as $(x_1', \ldots, x_n')$, the least-cost 
purchase at these prices, i.e., 


(8) $\sum p_i' x_i^* \ge \sum p_i' x_i' = E(p', u^*)$. 


Similarly, at prices $p''$ the bundle $x^*$ must be at least as expensive as $x''$ 
so that 


(9) $\sum p_i'' x_i^* \ge \sum p_i'' x_i'' = E(p'', u^*)$. 


Consequently, at the prices $p^* = wp' + (1 - w)p''$ for which the $x_i^*$ are 
optimal, 

(10) $E(p^*, u^*) = \sum p_i^* x_i^* = \sum [wp_i' + (1 - w)p_i''] x_i^*$ 

$= w \sum p_i' x_i^* + (1 - w) \sum p_i'' x_i^*$ 

$\ge wE(p', u^*) + (1 - w)E(p'', u^*)$ [by (8) and (9)]. 


But by (7) relationship (10) is precisely what we mean by the concavity of 
$E$. Q.E.D. 
Having completed our proof we turn next to the economic interpretation 
of the theorem. The discussion will also help to clarify the concept of 
concavity. 

Taken literally, Proposition 3 asserts that if we consider two alternative 
sets of prices, then a (weighted) average of these prices will certainly not 
decrease (from the average cost under the initial sets of prices) the amount 
of money which a consumer must lay out to achieve a given level of utility; 
and, indeed, at the averaged prices the consumer may well have to spend 
more for the purpose. Thus the theorem says, in effect, that price averaging 
tends to make it more costly to maintain one's real income. But why should 
this be true? A simple example will show intuitively why it is so. Suppose 
that the only two items whose prices vary are two close substitutes, say, 
1969 wines of two vineyards in Ste. Julien, call them b and t. Suppose in 
the first price set b sells for $4 per bottle and t sells at $8, while in the second 
set the prices are reversed. In either case the consumer can then obtain the 
10 bottles needed to satisfy his craving for Bordeaux by purchasing 10 
bottles of the less expensive wine at a total outlay of $40. If, however, the 
prices were simply averaged so that each wine sold at $6, obviously the 
cost of any 10 bottles would now be $60. In sum, the theorem indicates 
that nonuniformity in the price of substitute goods provides an opportunity 
for saving by purchasing the cheaper of the available substitutes and that 
averaging of prices, by making costs more uniform, eliminates such 
bargains. 
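
The wine example can be reproduced in a few lines. The sketch treats the two wines as perfect substitutes, which is a simplifying assumption (the text calls them close substitutes), so that the least-cost way to obtain a given number of bottles is to buy only the cheaper wine. The function name `expenditure` and the extra weights tried at the end are illustrative choices.

```python
def expenditure(p_b, p_t, bottles):
    """Least cost of `bottles` bottles when wines b and t are perfect substitutes
    (an assumption of this sketch): buy only the cheaper one."""
    return bottles * min(p_b, p_t)


first = (4.0, 8.0)     # b at $4, t at $8
second = (8.0, 4.0)    # prices reversed
bottles = 10

e_first = expenditure(first[0], first[1], bottles)
e_second = expenditure(second[0], second[1], bottles)

# Simple average of the two price sets (w = 1/2): each wine sells at $6.
e_avg = expenditure(6.0, 6.0, bottles)

# Concavity check (7) for several weights w between 0 and 1.
concave_everywhere = True
for w in (0.1, 0.25, 0.5, 0.75, 0.9):
    p_b = w * first[0] + (1 - w) * second[0]
    p_t = w * first[1] + (1 - w) * second[1]
    averaged_cost = w * e_first + (1 - w) * e_second
    if expenditure(p_b, p_t, bottles) < averaged_cost - 1e-9:
        concave_everywhere = False

print(e_first, e_second, e_avg, concave_everywhere)
```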


8. Dual Properties of Utility and Expenditure Functions 


We can now quickly summarize without proof14 several key propositions 
about expenditure functions, before turning to another extremely important 
result whose proof will be outlined. 


Proposition 4: If the utility function has properties 1-4, then the 
expenditure function is differentiable with respect to u* and to commodity 
prices (assuming the prices in question are positive), and it is monotone 
nondecreasing with the prices, i.e., an increase in prices will never reduce 
the cost of attaining a given level of utility. 


Next we have the basic duality theorem, which asserts that there exists 
a well-behaved utility function corresponding to every well-behaved 
expenditure function. 


14 For proofs, see e.g., Ronald Shephard, op. cit., and Daniel McFadden, “Cost, 
Revenue and Profit Functions,” in D. McFadden (ed.), An Econometric Approach to 
Production Theory, North-Holland Publishing Company, Amsterdam, forthcoming. 
Proposition 5 (the Shephard-Uzawa duality theorem): If an expenditure 
function, $E'$, has the properties specified in Propositions 1-4, i.e., it 
is linearly homogeneous, concave in prices, differentiable, and monotonically 
increasing with $u^*$, then there exists a utility function, 
$u'(x_1, \ldots, x_n)$, which has properties 1-4 and is such that if we deduce 
an expenditure function $E''$ from $u'$ then $E''$ and $E'$ will be identical. 


Since the expenditure function is a continuous function of prices and 
“utility levels," Y = E(p, u*), one can, in principle, invert the relation- 
ship to solve for the value of u* as a function of p and expenditure level, Y. 
A utility function derived in this way is called the indirect utility function. 
That such an inversion of the expenditure function to the indirect utility 
function is possible follows from Propositions 2 and 4, since if a function 
$y = f(x)$ is differentiable, it must be continuous, and if it is also strictly 
monotonic, it must possess a well-defined inverse $x = f^{-1}(y)$. This is easy to 
envision graphically because on the graph of such a function every $x$ must 
correspond to a unique $y$ and vice versa since the curve can neither have gaps 
nor can it ever “bend back” to yield, say, two values of $y$ for any $x$. 
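
The inversion can be carried out numerically without any closed form for the inverse. The sketch below assumes, purely for illustration, the expenditure function $E(p, u^*) = 2u^*\sqrt{p_1 p_2}$ (the one generated by the utility function $\sqrt{x_1 x_2}$, which is not taken from the text) and recovers $u^*$ from $p$ and $Y$ by bisection, using only the facts that $E$ is continuous and strictly increasing in $u^*$. The function names and the search bound `u_hi` are hypothetical.

```python
import math


def expenditure(p1, p2, u):
    """Assumed expenditure function for u(x1, x2) = sqrt(x1 * x2)."""
    return 2.0 * u * math.sqrt(p1 * p2)


def indirect_utility(p1, p2, income, u_hi=1e6, tol=1e-10):
    """Solve E(p, u) = income for u by bisection on the interval [0, u_hi].

    Valid because E is continuous and strictly increasing in u; u_hi is an
    assumed upper bound large enough to bracket the answer.
    """
    lo, hi = 0.0, u_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expenditure(p1, p2, mid) < income:
            lo = mid      # utility level mid costs less than Y: move up
        else:
            hi = mid      # utility level mid costs at least Y: move down
    return 0.5 * (lo + hi)


v = indirect_utility(2.0, 8.0, 40.0)          # hypothetical prices and outlay
print(v, expenditure(2.0, 8.0, v))            # round trip returns the outlay
```

Bisection is used here only because it needs nothing beyond monotonicity; any root-finding method would serve.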

However, the importance of the fundamental duality Proposition 5 is 
not that it suggests a procedure for the determination of utility functions 
but that it permits us, for many purposes, to dispense with the utility 
function altogether. Proposition 5 tells us that the expenditure function, 
in effect, contains within it an implied utility function, and that to assure 
ourselves that this indirect utility function is well behaved we need never 
actually find that function explicitly. For if the expenditure function has 
the desirable properties of linear homogeneity, concavity, and differen- 
tiability, then we can be certain that the indirect utility function also 
possesses the properties usually desired of it. 


9. The Compensated Demand Function. Shephard's Lemma. 


We come now to one of the propositions that underlies many of the 
theoretical applications of the duality analysis. We will show that one can 
determine directly the demand relationship, which underlies so much of 
theoretical and empirical analysis, by simple differentiation of the expendi- 
ture function with respect to the price of the item whose demand is being 
studied. However, the resulting relationship is of the sort called a com- 
pensaled demand function (see Chapter 9, Section 13). That is, it is a 
demand function from which the income effect has been removed so that it 
describes only the substitution effect. It does this by taking the consumer 
to have been compensated for the loss in his purchasing power that would 
otherwise occur when a price rises (or for the rise in his purchasing power 
resulting from a price fall). In other words, it tells us how a consumer's 
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purchases will change when some price, p;, is replaced by p; + Api and the 

consumer is simultaneously provided enough additional income to keep his 

utility level (real income) unchanged. Thus, the derivative of the com- 

pensated demand curve, óz;/8p;, is precisely the substitution effect of a 

change in the price of good ? upon the quantity of that good purchased. 
We may now state 


Proposition 6 (Shephard's lemma): The partial derivative of the
expenditure function with respect to the price of commodity $i$ (assuming
its price is positive) is equal to $x_i^0$, the optimal demand for the $i$th
good, i.e.,


(11)    $\frac{\partial E(p, u^*)}{\partial p_i} = x_i^0,$

so that, since the value of $u$ is held constant at $u^*$, (11) is the
compensated demand for commodity $i$ as a function of prices.


Proof:¹⁵ By definition of the expenditure function


(12)    $\sum p_i x_i^0 - E(p, u^*) = 0.$


Now consider any alternative set of prices, $p' = (p_1', \dots, p_n')$, and
the corresponding optimal set of consumption levels,
$x' = (x_1', \dots, x_n')$, yielding the utility $u^*$. Then since for the
given utility level $x'$ rather than $x^0$ is optimal (expenditure
minimizing) under these alternative prices, $p'$, we must have


(13)    $E(p', u^*) = \sum p_i' x_i' \le \sum p_i' x_i^0$

or


(14)    $\sum p_i' x_i^0 - E(p', u^*) \ge 0$, for any set of positive prices $p'$.


Comparing (14) and (12) we see that expression (14) reaches its lowest
possible value for any positive prices when $p' = p$. Since the prices $p$
represent a minimum, they must satisfy the first-order conditions for a
minimum of (14) that the partial derivative of (14) with respect to any
positive price $p_i'$ must be equal to zero. That is, we must have


$\frac{\partial[\sum p_i' x_i^0 - E(p', u^*)]}{\partial p_i'} = x_i^0 - \frac{\partial E(p', u^*)}{\partial p_i'} = 0$ at $p' = p$,


which is precisely what (11) asserts.


15 The reader may wonder why we do not get our result simply by
differentiation of the expression $E(p, u^*) = \sum p_i x_i^0$ with respect
to $p_i$. But two functions which have an equal value need not have equal
derivatives! At their intersection point a supply and demand curve have
equal coordinates, but we normally expect one to have a negative slope
while the other's slope is usually taken to be positive.


One property of demand relationships now follows immediately from 
Proposition 1, which asserts that the expenditure function is linearly 
homogeneous. It will be recalled from Chapter 11, Section 10, that given a 
function homogeneous of degree r, its partial derivatives will each be 
homogeneous functions of degree r — 1. Consequently, from Propositions 
1 and 6 we obtain 


Proposition 7: The compensated demand relationship derived from a 
well-behaved expenditure function is homogeneous of degree zero. That is, 
a proportionate change in all prices will leave quantities demanded entirely 
unaffected. 
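
Propositions 6 and 7 can be checked numerically on a concrete case. The
sketch below is only an illustration under stated assumptions: it takes a
two-good Cobb-Douglas utility function $u = x_1^A x_2^{1-A}$ (not discussed
in the text at this point), whose expenditure function has a known closed
form, and the parameter `A`, the function names, and the price and utility
figures are all hypothetical.

```python
# Illustrative check of Propositions 6 and 7 (assumed Cobb-Douglas example).

A = 0.3  # hypothetical preference parameter, 0 < A < 1


def expenditure(p1, p2, u):
    """E(p, u*) implied by the utility function u = x1**A * x2**(1 - A)."""
    return u * (p1 / A) ** A * (p2 / (1 - A)) ** (1 - A)


def compensated_demand(p1, p2, u):
    """Expenditure-minimizing bundle (x1, x2) that attains utility u."""
    e = expenditure(p1, p2, u)
    return A * e / p1, (1 - A) * e / p2


def dE_dp1(p1, p2, u, h=1e-6):
    """Central-difference estimate of the partial derivative of E w.r.t. p1."""
    return (expenditure(p1 + h, p2, u) - expenditure(p1 - h, p2, u)) / (2 * h)


p1, p2, u = 2.0, 5.0, 10.0  # hypothetical prices and utility level
x1, x2 = compensated_demand(p1, p2, u)

# Proposition 6: the price derivative of E matches the compensated demand.
print(abs(dE_dp1(p1, p2, u) - x1) < 1e-6)

# Proposition 7: tripling every price leaves the demanded bundle unchanged.
y1, y2 = compensated_demand(3 * p1, 3 * p2, u)
print(abs(y1 - x1) < 1e-9 and abs(y2 - x2) < 1e-9)
```

The derivative is approximated by a finite difference rather than computed
symbolically, so the comparison holds only up to a small numerical tolerance.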


10. The Slutsky Theorem and Other Results in Comparative Statics


The power of duality theory is perhaps most forcefully illustrated in its 
applications to comparative statics. It will be recalled that comparative 
statics examines the effects on the values of the endogenous variables of a 
model of a change in the value of one of its parameters. For example, it 
investigates the effects on the consumer’s purchases of a change in his 
income or in the price of some commodity. 

The expenditure function is well designed to deal with such issues 
because its formulation represents the role of prices much more directly 
than that in the conventional utility maximization model. We will see now 
that results which require tedious calculations in the conventional analysis 
almost fall into our laps from the expenditure function and the compensated 
demand relationship derived from it. For example, we have the standard 


Slutsky theorem: 


Proposition 8: If the expenditure function is well behaved and has
second-partial derivatives, then the substitution effect will be negative
(or zero), i.e., holding utility constant an increase in the price of good
$i$ will lead to a reduction or at least no increase in the quantity of
good $i$ consumed.


Proof: Since by Proposition 3 the expenditure function is concave in
prices, we must have


(15)    $\partial^2 E/\partial p_i^2 \le 0.$

Now the substitution effect is given by $\partial x_i^0/\partial p_i$
($u^*$ constant). But by (11) and (15)


$\frac{\partial x_i^0}{\partial p_i} = \frac{\partial^2 E}{\partial p_i^2} \le 0$  (Q.E.D.).


This proof is so short and simple that the magnitude of its accomplish- 
ment may be recognized only by those who have gone through the tedious 
derivation of the conventional analysis with its many differentiations, its 
bordered Hessian determinants, and its Cramer's rule calculations. Here 
we see one of the major benefits offered by duality theory: It enables us to 
translate the painstaking arguments of the standard comparative statics 
into utterly simple terms. Once the building blocks of the duality theory 
are laid down, many further results begin to drop out with little additional 
effort, as we will illustrate now.¹⁶


Proposition 9: Every good $i$ must have at least one substitute $j$, meaning
that (neglecting income effect) if the price of $j$ rises, people will switch
demand to $i$. That is, for every good, $i$, there must be at least one good,
$j$, for which $\partial x_i/\partial p_j \ge 0$ and for which, if the demand
curve for $i$ has a negative slope (so that $\partial x_i/\partial p_i < 0$),
we have the strict inequality, $\partial x_i/\partial p_j > 0$.


Proof: Since $x_i = \partial E/\partial p_i$ is homogeneous of degree zero
by Proposition 7, Euler's theorem (Chapter 11, Section 10) tells us


$p_1\,\partial x_i/\partial p_1 + p_2\,\partial x_i/\partial p_2 + \cdots + p_n\,\partial x_i/\partial p_n = 0.$


Thus, if the $i$th term, $p_i\,\partial x_i/\partial p_i$, is negative
(nonpositive), at least one of the remaining terms must be positive
(nonnegative). Q.E.D.


Finally, we have the standard symmetry result


Proposition 10: $\partial x_i/\partial p_j = \partial x_j/\partial p_i$.


Proposition 10 follows directly from the assumption that the expenditure
function has continuous second derivatives so that
$\partial x_i/\partial p_j = \partial^2 E/\partial p_i\,\partial p_j = \partial^2 E/\partial p_j\,\partial p_i = \partial x_j/\partial p_i$.
This theorem asserts that the substitution effect of $p_j$ on the demand
for $i$ must exactly equal the substitution effect of $p_i$ on the demand
for $j$. In other words, if $i$ is a net substitute for $j$, then $j$ must be
a net substitute for $i$ to precisely the same degree, as measured by
absolute price response.


16 We may note why the expenditure function approach is able to dispense with
explicit use of the second-order conditions on which the conventional analysis relies so
heavily. The answer, of course, lies in the strict quasi-concavity of the implicit utility
function corresponding to a well-behaved expenditure function. Strict quasi-concavity
assures satisfaction of the second-order conditions for maximization of the consumer's
utility subject to a linear budget constraint.

This proposition shows the symmetry of the net substitution and net
complementarity relationships, i.e., those that hold after removal of the
income effect, as has been done in the compensated demand relationships
with which we are dealing. That is, Proposition 10 shows that if good $i$ is
a net substitute for $j$, meaning $\partial x_i/\partial p_j > 0$, then $j$
must be a net substitute for $i$, and the same must hold for
complementarity.
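
The three comparative-statics results (Propositions 8, 9, and 10) can be
illustrated together by estimating the matrix of substitution effects with
finite differences. This is a rough sketch under assumptions not made in the
text: compensated demands are taken from a two-good Cobb-Douglas utility
function, and the parameter `A`, the helper names, and all numbers are
hypothetical.

```python
# Finite-difference check of Propositions 8-10 (assumed Cobb-Douglas case).

A = 0.3  # hypothetical preference parameter, 0 < A < 1


def demand(p, u):
    """Compensated demands [x1, x2] for the utility u = x1**A * x2**(1 - A)."""
    e = u * (p[0] / A) ** A * (p[1] / (1 - A)) ** (1 - A)
    return [A * e / p[0], (1 - A) * e / p[1]]


def substitution_matrix(p, u, h=1e-6):
    """S[i][j] approximates the substitution effect d x_i / d p_j."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        up, down = list(p), list(p)
        up[j] += h
        down[j] -= h
        hi, lo = demand(up, u), demand(down, u)
        for i in range(2):
            s[i][j] = (hi[i] - lo[i]) / (2 * h)
    return s


p, u = [2.0, 5.0], 10.0  # hypothetical prices and utility level
S = substitution_matrix(p, u)

# Proposition 8: own-price substitution effects are nonpositive.
print(S[0][0] <= 0 and S[1][1] <= 0)

# Proposition 9: the Euler sum for good 1 vanishes, so good 2 is a substitute.
print(abs(p[0] * S[0][0] + p[1] * S[0][1]) < 1e-6, S[0][1] > 0)

# Proposition 10: the cross effects are symmetric.
print(abs(S[0][1] - S[1][0]) < 1e-6)
```

With only two goods, Proposition 9 forces the two goods to be net
substitutes for each other, which is what the positive off-diagonal entries
show.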


11. Cost, Revenue, and Profit Functions 


The dual approach to consumer theory has been used with equal 
effectiveness in dealing with production decisions and the decisions of the 
firm more generally. Just as the expenditure function is used as a substitute 
for the utility function in consumer analysis, in the theory of the firm 
under perfect competition (where all input and output prices are
parameters) the cost function, the revenue function, and the profit function have
been used to replace the production function, each taking on a different 
role that is conventionally assumed by the production function. As we will 
see, their relationship to the production function is almost perfectly 
analogous to that between the expenditure and utility functions. 

Specifically, the cost function relates to the firm’s input decisions given 
its output levels and input prices. The revenue function, symmetrically, 
relates to the multiproduct firm’s output decisions given the magnitudes of 
its inputs and the prices of its various products. Finally, the profit function 
relates to the combined decision: The choice of input and output quantities, 
given all input and output prices. 

We may note that systematic duality theory as described in this
chapter was first presented in terms of cost and production functions.
While there had been earlier pieces dealing with duality in consumer theory
more or less peripherally,¹⁷ it was Ronald Shephard's pathbreaking work,
Cost and Production Functions (1953), that first explored the subject
thoroughly. Indeed, the fundamental theorem showing the relationship
between the cost function and the input-demand function is known as
Shephard's lemma.¹⁸


17 This includes writings by G. B. Antonelli (1886), A. A. Konus (1924), H. Hotelling
(1932), J. R. Hicks (1946), P. A. Samuelson (1947), and, above all, R. Roy (1942, 1947)
by whom the earlier work was carried to its furthest extent. For fuller discussion of the
history of the theory, see W. E. Diewert, “Applications of Duality Theory,” in M.
Intriligator (ed.), Frontiers of Quantitative Economics, Vol. 2, North-Holland Publishing
Co., Amsterdam, 1971. For an excellent description of the subject matter overall, see
D. McFadden, “Cost, Revenue and Profit Functions,” op. cit.

First, as we did for utility functions, we must list some properties which 
one might expect to hold for a well-behaved production set (the set of 
possible input-output combinations) : 


1. Outputs require inputs. In any feasible input-output combination
with some positive output there must be at least one nonzero input. In
other words, if the input quantities are $r_1 = r_2 = \cdots = r_m = 0$, then
the outputs must be $y_1 = \cdots = y_n = 0$ (with zero inputs nothing will be
produced).


2. Free disposal and feasibility of the origin. The input-output combina- 
tion corresponding to the origin of production space is included in the 
feasible production set, i.e., it is feasible to produce nothing using zero 
input quantities. This premise, which is not so innocuous as it sounds, 
implicitly contains a free disposal assumption, for if any undesired objects 
(“outputs”) happen to be present, the feasibility of the origin requires that 
they can be removed without using any inputs to do so, because otherwise 
nonzero values of some of the inputs would be required to attain zero values 
of the outputs. 


3. Bounded production frontiers, i.e., for any fixed set of input quantities
$r_1^*, \dots, r_m^*$, the production frontier is bounded (i.e., with those
inputs there is some level of each output $i$ that cannot be exceeded given
the quantities of the other outputs).


4. Closedness. The feasible production set (as well as the production 
frontier for any fixed set of inputs and the production isoquant for any 
given bundle of outputs) is closed.¹⁹


5. Convexity of the production set. This means that physical returns are
diminishing both in the sense that added inputs yield marginal products
that diminish or remain constant and in the sense that marginal rates of
substitution between inputs or between outputs either diminish or remain
constant. That is, as the producer gives up more and more of a product
$y_1$, the amount of product $y_2$ he can obtain instead from a given set of
inputs does not increase at the margin, etc.


18 Recently, work in the area has multiplied, produced by newer writers such as
D. McFadden, G. Hanoch, W. E. Diewert, S. N. Afriat, and H. Uzawa.

19 A set is defined to be closed if, given any convergent, infinite sequence of points
in the set, the limit point of that sequence is also contained in the set. Intuitively, a
closed set is made up of an interior plus its boundary, while an open set includes the
interior but not the boundary. Thus, the interior of a circle plus its circumference make
up a closed set, while the set made up of the interior of the circle, but which does not
contain its circumference, is open.


12. On Cost and Production Functions 


A cost function $C(p, y^*)$ is defined as the solution of the cost
minimization problem for the production of a given output bundle
$y^* = (y_1^*, \dots, y_n^*)$, i.e., it is the solution of


        Min $\sum p_i r_i$
(16)    subject to
        $g(y_1^*, \dots, y_n^*, r_1, \dots, r_m) \le 0,$


where $g(\cdot) \le 0$ is the production function, and where $p_i$ is the
given price of input $i$. That is, if $r_i^0$ is the optimal value of input
$i$ in (16), then the cost function is

$C(p, y^*) = \sum p_i r_i^0.$


We will now state a number of propositions about the relationships 
between cost and production functions, each one the analogue of a proposi- 
tion in the theory of expenditure functions. We offer no proofs because 
each proof follows the logic of the analogous proposition about expenditure 
functions. 

Let us first offer the 


Definition: The set T is called a standard production possibility set if it 
is not empty and if it is well behaved, meaning that it satisfies conditions 
1-4 of the preceding section, i.e., the production frontier is bounded, the 
production set is closed, and zero inputs yield zero outputs. 


If the production set is standard and convex, then a solution to the cost
minimization problem (16) exists, and so the cost function has a
nonnegative finite value. Moreover, we have


Proposition 11: The cost function is a linearly homogeneous function in 
prices for producible output bundles and strictly positive input prices. 


Proposition 12: The cost function is strictly monotonically increasing in 
outputs. 


Proposition 13: The cost function is concave in input prices.


Proposition 14: The cost function is differentiable with respect to input
prices and output quantities, and it is monotone nondecreasing in input
prices.


Proposition 15 (Shephard): If the cost function, $C'$, is standard in the
sense that it satisfies Propositions 11-14, and if zero cost yields zero
output and if cost is a continuous function of outputs, then there exists a
standard production set $T'$ such that if we deduce a cost function $C''$
from $T'$ then $C''$ and $C'$ will be identical.


The clear analogy between Propositions 11-15 for cost and production 
functions and Propositions 1-5 should illustrate their similarity. The reader 
can easily go through the remaining Propositions 6-10 for expenditure 
functions and formulate the cost function analogues for him- or herself. 


13. Revenue and Profit Functions: Definitions 


Since a similar array of propositions applies to revenue and profit
functions, we will not even list any of them but confine ourselves to a
formulation of their definitions, both of them obviously related to the
production set.

The revenue function gives the revenue that can be obtained from the
optimal combination of outputs for any fixed bundle of inputs. That is,


Definition: Given the output prices $p_1, \dots, p_n$ for outputs
$y_1, \dots, y_n$, the revenue function for any fixed set of input
quantities $r_1^*, \dots, r_m^*$ is defined as


$R(p, r^*) = \sum p_i y_i^0$


for $y^0$ satisfying the revenue maximization requirement


        Max $\sum p_i y_i$
(17)    subject to the production requirement
        $g(y_1, \dots, y_n, r_1^*, \dots, r_m^*) \le 0.$


Finally, the profit function is obtained from an implicit and
simultaneous determination of inputs and outputs where we take $z_i$ to
represent either an output or an input quantity, with $i$ being an output if
$z_i > 0$ and an input if $z_i < 0$. Thus we have


Definition: Given the input-output prices $p_1, \dots, p_w$ for output or
input quantities $z_1, \dots, z_w$, the profit function is defined as
$\pi = \pi(p) = \sum p_i z_i$, where $z$ is the solution of the maximization
problem

        Max $\sum p_i z_i$
(18)    subject to
        $g(z_1, \dots, z_w) \le 0.$


14. Illustration I: Deducing the Cost and Profit Functions from the Production
Function


To give concreteness to the preceding discussion we now describe how
one actually goes about finding the expenditure function from a given
utility function, or the cost, revenue, or profit function, given the
production function. We will illustrate the process by starting from a
simplified though rather unrealistic production function


(19)    $y = aLK,$


where $y$ is output and $K$ and $L$ are the quantities of capital and labor
inputs, respectively. The objective is to find the cost-minimizing values of
$K$ and $L$ for some specified output level, $y^*$, substituting these optimal
values into $P_L L + P_K K$ to obtain the corresponding value of the cost
function with input prices $P_L$ and $P_K$. Thus, substituting production
function (19) into the cost minimization model (16) we obtain the
Lagrangian

$P_L L + P_K K + \lambda(y^* - aLK)$


with first-order conditions


$P_L = \lambda aK$
$P_K = \lambda aL$
$y^* = aLK.$


Hence, eliminating $\lambda$, we get $P_L/P_K = K/L$, and substituting this
into the last of the first-order conditions successively for $K$ and $L$ we
obtain


$y^* = aL^2 P_L/P_K$  or  $L = (y^* P_K/aP_L)^{1/2}$


and, similarly,

$K = (y^* P_L/aP_K)^{1/2}.$


Consequently, the cost function is given by


$C(P_K, P_L, y^*) = P_K K + P_L L = 2(y^* P_K P_L/a)^{1/2},$


which involves the sharply decreasing marginal and average costs such as
might be expected to follow from the extreme scale economies implicit in
(19). From the analogue of Proposition 6 (Shephard's lemma) the implicit
demand functions for labor and capital are obtained directly by
differentiation of the cost function with respect to $P_L$ and $P_K$ to
yield once again


$L = (y^* P_K/aP_L)^{1/2}$  and  $K = (y^* P_L/aP_K)^{1/2}.$
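
The derivation for the production function (19) can be verified with a few
lines of arithmetic. The sketch below is illustrative only: the function
names and the numerical values of the prices, the output level, and the
parameter `a` are hypothetical choices, picked so that the results are easy
to follow by hand.

```python
import math


def input_demands(p_l, p_k, y, a):
    """Cost-minimizing (L, K) for y = a*L*K, from the first-order conditions."""
    labor = math.sqrt(y * p_k / (a * p_l))
    capital = math.sqrt(y * p_l / (a * p_k))
    return labor, capital


def cost(p_l, p_k, y, a):
    """Cost function C = 2 * (y * P_K * P_L / a) ** (1/2)."""
    return 2.0 * math.sqrt(y * p_k * p_l / a)


def isoquant_cost(labor, p_l, p_k, y, a):
    """Outlay when labor is fixed and K is whatever keeps output at y."""
    capital = y / (a * labor)
    return p_l * labor + p_k * capital


# Hypothetical numbers chosen only for illustration.
p_l, p_k, y, a = 4.0, 9.0, 18.0, 0.5
L, K = input_demands(p_l, p_k, y, a)

print(L, K, a * L * K)                        # the bundle produces y
print(p_l * L + p_k * K, cost(p_l, p_k, y, a))  # outlay equals C

# Nearby points on the same isoquant are more expensive.
print(isoquant_cost(L - 1, p_l, p_k, y, a) > cost(p_l, p_k, y, a))
print(isoquant_cost(L + 1, p_l, p_k, y, a) > cost(p_l, p_k, y, a))

# Shephard's lemma: dC/dP_L, estimated numerically, is close to L.
h = 1e-6
slope = (cost(p_l + h, p_k, y, a) - cost(p_l - h, p_k, y, a)) / (2 * h)
print(abs(slope - L) < 1e-6)
```

The comparison with neighboring points on the isoquant is a spot check of
the minimum, not a proof; the first-order conditions in the text supply the
general argument.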


The same process can be used to derive the cost, revenue, and profit
functions for other types of production functions, though the calculations
are usually considerably more tedious. For example, the standard
Cobb-Douglas production function


$y = bL^a K^{1-a}$

turns out, by exactly the same set of steps, to yield the cost function

$C(P_L, P_K, y^*) = b^{-1}(1-a)^{-(1-a)} a^{-a} y^* P_K^{1-a} P_L^{a}$

with the implicit demand function for labor given by


$\frac{\partial C(\cdot)}{\partial P_L} = ab^{-1}(1-a)^{-(1-a)} a^{-a} y^* P_K^{1-a} P_L^{a-1}$

or

$L = b^{-1}(1-a)^{-(1-a)} a^{(1-a)} y^* P_L^{(a-1)} P_K^{(1-a)}$


and that for capital is obtained analogously from
$\partial C(\cdot)/\partial P_K = K$.

By exactly the same process one obtains a revenue function for a given
production function using the constrained maximum problem (17) instead
of (16), which gave us the cost function. Finally, the profit function is
deduced from (18) also using the procedure that has just been described.²⁰


20 It should be noted that the Cobb-Douglas function does not have a profit
function for, with constant returns to scale and constant input and output prices,
profit per unit will be constant. Hence, if profit per unit is zero, the output level will be
indeterminate and the total profit zero, no matter what the output level. If profit per
unit is positive, then neither optimal output nor total profit will be finite.


15. Illustration II: Useful Cost Functions Derived Independently of Production
Functions


One of the most attractive features of the duality approach is that it
permits us to postulate directly expenditure, cost, or profit functions in
convenient forms, without deriving them from any utility or production
functions or even explicitly checking them against such associated
functions. One need merely look at a specific cost or expenditure function
to see that it satisfies several basic requirements such as linear
homogeneity and concavity in prices, and we know by Propositions 5 and 15
that the associated production and utility functions will be well behaved.
One can then design a cost or expenditure function that is convenient for
whatever purposes are at hand and simply check the properties of the
postulated relationship itself in order to infer from Proposition 5 or 15
that all is well with the associated production or utility functions.

A good illustration is provided by Diewert, who has devised what he
calls a generalized Leontief cost function,²¹ whose equation is


$C(p, y^*) = h(y^*) \sum_i \sum_j b_{ij}\, p_i^{1/2} p_j^{1/2},$


where $p_i$ is the price of input $i$, $y^*$ is the selected output level,
and $h(y^*)$ is a function of $y^*$ that is continuous, monotonically
increasing, and such that $h(0) = 0$, with $h$ tending toward infinity with
$y^*$. It is also postulated that the parameter values satisfy
$b_{ij} = b_{ji}$.

We note first that this cost function is linearly homogeneous in prices
[multiplication of each $p_i$ by the same constant, $k$, multiplies the
entire function, $C$, exactly by $(k^{1/2})(k^{1/2}) = k$].

Second, the function obviously increases monotonically and continuously
with $y^*$ by the assumed properties of $h(y^*)$.

Third, provided the parameters, $b_{ij}$, satisfy a certain set of
inequalities which there is no point in reproducing here, the function will
be concave in prices.

In these circumstances the conditions for Proposition 15 are satisfied, 
and we are therefore sure without further investigation or without explicit 
specification of any production function that there exists a well-behaved 
production function (i.e, a standard production possibility set) that 
corresponds to Diewert's cost function. 

Equally noteworthy is the linearity of the cost function in the
parameters $b_{ij}$. This is important because if one is to use econometric
methods to estimate the parameters of the cost function then the $b_{ij}$'s
are the unknowns whose values must be estimated from the data. The
linearity of the cost function in the $b_{ij}$ means that one can make use
of all the simplifications that become possible for econometric methods in
a linear case.
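
Two of the properties just claimed for the generalized Leontief cost
function, linear homogeneity in prices and linearity in the parameters, are
easy to confirm numerically. The following is a minimal sketch; the function
name `gl_cost`, the two parameter matrices, and the prices are hypothetical,
and $h(y^*)$ is simply passed in as a number.

```python
import math


def gl_cost(prices, b, h_y=1.0):
    """Generalized Leontief cost: h(y*) * sum_i sum_j b[i][j] * sqrt(p_i * p_j)."""
    n = len(prices)
    return h_y * sum(
        b[i][j] * math.sqrt(prices[i] * prices[j])
        for i in range(n)
        for j in range(n)
    )


# Hypothetical symmetric parameter matrices (b[i][j] == b[j][i]).
b = [[1.0, 0.5], [0.5, 2.0]]
c = [[0.2, 0.1], [0.1, 0.4]]
p = [4.0, 9.0]

# Linear homogeneity: scaling every price by k scales the cost by k.
k = 3.0
print(math.isclose(gl_cost([k * x for x in p], b), k * gl_cost(p, b)))

# Linearity in the parameters: C evaluated at b + c equals C(b) + C(c).
b_plus_c = [[b[i][j] + c[i][j] for j in range(2)] for i in range(2)]
print(math.isclose(gl_cost(p, b_plus_c), gl_cost(p, b) + gl_cost(p, c)))
```

Concavity in prices is not checked here, since it depends on the inequality
restrictions on the parameters that the text leaves unstated.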

To illustrate further the properties of this cost function we deal with
the two-input case and set $y$ at such a level that $h(y^*) = 1$. In that
case the cost function becomes (since $p_i^{1/2} p_i^{1/2} = p_i$, and we have
assumed $b_{12} = b_{21}$),

$C(p_1, p_2, y^*) = b_{11} p_1 + 2b_{12}\, p_1^{1/2} p_2^{1/2} + b_{22} p_2.$


21 See W. E. Diewert, “An Application of the Shephard Duality Theorem: A
Generalized Leontief Production Function,” Journal of Political Economy, Vol. 79,
May/June 1971, pp. 481-507.


By Proposition 6 we now can obtain the demand functions for inputs 1
and 2 by differentiation of the cost function in turn with respect to the
prices of these inputs. This yields the derived demand expression


        $\partial C/\partial p_1 = x_1 = b_{11} + b_{12}(p_2/p_1)^{1/2}$
(20)
        $\partial C/\partial p_2 = x_2 = b_{22} + b_{12}(p_1/p_2)^{1/2}.$


We can use these derived demand functions to obtain the iso-product
curve showing the alternative input combinations $(x_1, x_2)$ that can yield
$y^*$, the given output quantity. If $p_1 > 0$, $p_2 > 0$, from the two
equations (20) we obtain directly the equation of the iso-product locus
relating $x_2$ to $x_1$:


$(p_2/p_1)^{1/2} = \frac{x_1 - b_{11}}{b_{12}} = \frac{b_{12}}{x_2 - b_{22}}$


or


(21)    $(x_1 - b_{11})(x_2 - b_{22}) = b_{12}^2.$
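
A quick numerical check confirms that every input bundle generated by the
demand functions (20) lies on the locus (21), whatever the price ratio. The
sketch assumes $h(y^*) = 1$ as in the text; the parameter values and the
list of price ratios are hypothetical.

```python
import math


def input_demands(p1, p2, b11, b12, b22):
    """Derived demands from equations (20), with h(y*) = 1."""
    x1 = b11 + b12 * math.sqrt(p2 / p1)
    x2 = b22 + b12 * math.sqrt(p1 / p2)
    return x1, x2


b11, b12, b22 = 1.0, 2.0, 3.0  # hypothetical parameter values

for ratio in (0.25, 1.0, 4.0, 9.0):  # assumed values of p2 / p1
    x1, x2 = input_demands(1.0, ratio, b11, b12, b22)
    on_locus = math.isclose((x1 - b11) * (x2 - b22), b12 ** 2)
    print(ratio, x1, x2, on_locus)
```

As the price of input 2 rises relative to that of input 1, the bundle moves
along the curve toward more of input 1 and less of input 2, which is the
substitution that (21) describes.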


Writing $z_1 = (x_1 - b_{11})$ and $z_2 = (x_2 - b_{22})$ it becomes clear
that the graph of Equation (21) is a rectangular hyperbola (curve II' in
Figure 3).

The hyperbola is asymptotic to the $z_1$ and $z_2$ axes at which we have,
respectively, $x_2 = b_{22}$ and $x_1 = b_{11}$. The graph can, consequently,
be redrawn as in Figure 4 with $x_1$ and $x_2$ on the axes instead of the
$z_1$ and $z_2$ of Figure 3. In Figure 4 the unshaded portion of the diagram
is identical to Figure 3, with point A in Figure 4 corresponding²² to the
origin of Figure 3.


Figure 3 (axis labels $z_1 = x_1 - b_{11}$, $z_2 = x_2 - b_{22}$)


Figure 4


Note that the curvature of the iso-product curve II' depends on the
magnitude of the $b_{ij}$, which consequently determine the elasticity of
substitution between the two inputs (Chapter 11, Section 13), for we have
by (21)

$x_2 = b_{22} + b_{12}^2/(x_1 - b_{11}),$


whose second derivative with respect to $x_1$ is


$\frac{d^2 x_2}{dx_1^2} = \frac{2b_{12}^2}{(x_1 - b_{11})^3}.$


Thus, the curvature of the iso-product curve which determines the elasticity
of substitution is itself determined by $b_{12}$ and $b_{11}$.

In particular, if $b_{12} = 0$, then by (21) the II' curve will include only
the point $x_1 = b_{11}$, $x_2 = b_{22}$ and all points with $x_1 > b_{11}$,
$x_2 = b_{22}$ or any point with $x_1 = b_{11}$, $x_2 > b_{22}$ since
redundant inputs do not reduce output. Consequently, the production
indifference curve then takes the L-shaped form implicit in the Leontief
input-output model,²³ as illustrated in Figure 5. This, of course, is why
Diewert calls his relationship a generalized Leontief function.


22 It is clear that if $b_{11}$ or $b_{22}$ (or both) are negative then II' will cross the
$x_2$ or the $x_1$ axis (or both). In that case the corresponding input will be one which
is dispensable, where $x_i$ is said to be dispensable if it is possible to have a positive
$y$ even if $x_i = 0$.

23 This is the case with zero elasticity of substitution (Chapter 11, Section 13) as
we would expect for the fixed coefficients of the Leontief production relationships.


Figure 5


Thus, we see from this example how duality theory permits us to
postulate cost (or expenditure) functions which are tractable statistically
(as is illustrated by the linearity of the generalized Leontief cost function
in the parameters, $b_{ij}$) and which offer us direct information on the
properties of the underlying functions (such as the preceding expressions
for the curvatures of the production indifference curves).


Chapter 15. The Firm and Its Objectives


We have now discussed the data which the firm needs for its 
decision-making—the demand for its products and the cost of supplying 
them. But, even with this information, in order to determine what decisions 
are optimal it is still necessary to find out the businessman’s aims. The 
decision which best serves one set of goals will not usually be appropriate 
for some other set of aims. 


1. Alternative Objectives of the Firm 


There is no simple method for determining the goals of the firm (or of 
its executives). One thing, however, is clear. Very often the last person to 
ask about any individual’s motivation is the person himself (as the psycho- 
analysts have so clearly shown). In fact, it is common experience when 
interviewing executives to find that they will agree to every plausible goal 
about which they are asked. They say they want to maximize sales and 
also to maximize profits; that they wish, in the bargain, to minimize costs; 
and so on. Unfortunately, it is normally impossible to serve all of such a 
multiplicity of goals at once. 

For example, suppose an advertising outlay of half a million dollars 
minimizes unit costs, an outlay of 1.2 million maximizes total profits, 
whereas an outlay of 1.8 million maximizes the firm’s sales volume. We 
cannot have all three decisions at once. The firm must settle on one of the 
three objectives or some compromise among them. 
[Figure: demand (= average revenue) and marginal revenue curves; horizontal axis: output]


output OQ, marginal cost equals marginal revenue—indeed, it is the 
crossing of the marginal cost and marginal revenue curves at that point 
which prevents further moves to the right (further output increases) from 
adding still more to the total profit area. Thus, we have once again
established that at the point of maximum profits, marginal costs and marginal
revenues must be equal. 


explicitly between it and its invalid converse. It is not generally true that 


This peculiar result is explained by recalling that the condition,
“marginal profitability equals zero,” implies only that neither a small
increase nor a small decrease in quantity will add to profits. In other
words, it means that we are at an output at which the total profit curve
(not shown) is level, going neither uphill nor downhill. But while the top
of a hill (the maximum profit output) is such a level spot, plateaus and
valleys (minimum profit outputs) also have the same characteristic: they
are level. That is, they are points of zero marginal profit, where marginal
cost equals marginal revenue.²

We conclude that while at a profit-maximizing output marginal cost 
must equal marginal revenue, the converse is not correct—it is not true 
that at an output at which marginal cost equals marginal revenue the firm 
can be sure of maximizing its profits. 
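
The point that equality of marginal cost and marginal revenue is necessary
but not sufficient can be seen in a small numerical case. The profit curve
below is purely hypothetical (a cubic chosen so that marginal profit
vanishes at two outputs); it is not taken from the text, and the function
names are illustrative.

```python
def profit(q):
    """Hypothetical total profit curve with one valley and one hilltop."""
    return -q ** 3 / 3 + 3 * q ** 2 - 5 * q


def marginal_profit(q):
    """First derivative of profit: marginal revenue minus marginal cost."""
    return -q ** 2 + 6 * q - 5


def second_derivative(q):
    """Second derivative of profit, used for the second-order condition."""
    return -2 * q + 6


def classify(q):
    """Label an output at which marginal profit is zero."""
    assert marginal_profit(q) == 0, "not a level spot of the profit curve"
    curvature = second_derivative(q)
    if curvature < 0:
        return "maximum"
    if curvature > 0:
        return "minimum"
    return "inconclusive"


# Marginal profit is zero at both outputs, yet only one maximizes profit.
for q in (1, 5):
    print(q, classify(q), profit(q))
```

Both outputs satisfy the marginal condition, but the sign of the second
derivative separates the valley from the hilltop.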


3. Application: Pricing and Cost Changes 


The preceding theorem permits us to make a number of predictions 
about the behavior of the profit-maximizing firm and to set up some 
normative "operations research" rules for its operation. We can determine 
not only the optimal output, but also the profit-maximizing price with the 
aid of the demand curve for the product of the firm. For, given the opti- 
mal output, we can find out from the demand curve what price will 
permit the company to sell this quantity, and that is necessarily the optimal 
price. In Figure 1, where the optimal output is $OQ_m$, we see that the
corresponding price is $Q_mP_m$, where point $P_m$ is the point on the demand
curve above $Q_m$ (note that $P_m$ is not the point of intersection of the
marginal cost and the marginal revenue curves).

It was shown in the last section of Chapter 4 how our theorem can also 
enable us to predict the effect of a change in tax rates or some other change 
in cost on the firm's output and pricing. We need merely determine how 
this change shifts the marginal cost curve to find the new profit-maximizing 
price-output combination by finding the new point of intersection of the 
marginal cost and marginal revenue curves. Let us recall one particular 
result for use later in this chapter—the theorem about the effects of a 
change in fixed costs. It will be remembered that a change in fixed costs 
never has any effect on the firm’s marginal cost curve (Chapter 3, Section 6) 
because marginal fixed cost is always zero (by definition, an additional 
unit of output adds nothing to fixed costs). Hence, if the profit-maximizing 
firm’s rents, its total assessed taxes, or some other fixed cost increases, 
there will be no change in the output-price level at which its marginal cost 
equals its marginal revenue. In other words, the profit-maximizing firm 
will make no price or output changes in response to any increase or decrease 
in its fixed costs! This rather unexpected result is certainly not in accord 


2 Again, this problem arises because our marginal maximum condition must be sup- 
plemented by a second-order condition—that the second derivative of profits be nega- 
tive, which means, in the present context, that the marginal revenue curve must cut 
the marginal cost curve from above (going from left to right). The reader should verify 
that this condition is satisfied at the profit-maximizing output OQm in Figure 1 but that
it is violated at OQ;. He should also give an economic interpretation of the condition. 
Compare Section 5 of Chapter 4. 


with common business practice and requires some further comment, which 
will be supplied presently. 


4. Extension: Multiple Products and Inputs 


The firm’s output decisions are normally more complicated, even in 
principle, than the preceding decisions suggest. Almost all companies pro- 
duce a variety of products and these various commodities typically compete 
for the firm’s investment funds and its productive capacity. At any given 
time there are limits to what the company can produce, and often, if it 
decides to increase its production of product x, this must be done at the 
expense of product y. In other words, such a company cannot simply 
expand the output of x to its optimum level without taking into account the 
effects of this decision on the output of y. 

For a profit-maximizing decision which takes both commodities into 
account we have a marginal rule which is a special case of Rule 2 of 
Chapter 3: 


Any limited input (including investment funds) should be 
allocated between the two outputs x and y in such a way that 
the marginal profit yield of the input, i, in the production of x 
equals the marginal profit yield of the input in the production of y. 


The reasoning behind this result is straightforward. If the condition is 
violated, the firm cannot be maximizing its profits, because the firm can 
add to its earnings simply by shifting some of i out of the product where 
it obtains the lower return and into the manufacture of the other. 

Stated another way, this last theorem asserts that if the firm is maximizing 
its profits, a reduction in its output of x by an amount which is 
worth, say, $5, should release just exactly enough productive capacity, C, 
to permit the output of y to be increased $5 worth. For this means that the 
marginal return of the released capacity is exactly the same in the production 
of either x or y, which is what the previous version of this rule asserted.³ 
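
The allocation rule can be checked numerically. The following is only a sketch under stated assumptions: the two square-root profit functions and the total of 100 units of the input are hypothetical and do not come from the text. A simple grid search finds the most profitable split of the input, and the marginal profit yields at that split turn out to be (approximately) equal, as the rule requires.

```python
import math

TOTAL_INPUT = 100.0  # hypothetical amount of the limited input i


def profit_x(units):
    """Hypothetical profit from devoting `units` of the input to product x."""
    return 10.0 * math.sqrt(units)


def profit_y(units):
    """Hypothetical profit from devoting `units` of the input to product y."""
    return 6.0 * math.sqrt(units)


def total_profit(to_x):
    """Total profit when `to_x` units go to x and the remainder goes to y."""
    return profit_x(to_x) + profit_y(TOTAL_INPUT - to_x)


def marginal(f, units, h=1e-4):
    """Central-difference estimate of the marginal profit yield of the input."""
    return (f(units + h) - f(units - h)) / (2 * h)


def best_split(steps=100_000):
    """Grid search over interior allocations of the input to product x."""
    candidates = (TOTAL_INPUT * k / steps for k in range(1, steps))
    return max(candidates, key=total_profit)


to_x = best_split()
mp_x = marginal(profit_x, to_x)
mp_y = marginal(profit_y, TOTAL_INPUT - to_x)
print(f"input to x: {to_x:.3f}, marginal yields: {mp_x:.4f} vs {mp_y:.4f}")
```

Shifting a unit of the input away from this split in either direction lowers total profit, which is the reasoning given above for the rule.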

Still another version of this result is worth describing: Suppose the 
price of each product is fixed and independent of output levels. Then we 
require that the marginal cost of each output be proportionate to its price, 
i.e., that $MC_x/P_x = MC_y/P_y$, where $P_x$ and $MC_x$ are, respectively, the 
price and the marginal cost of x, etc.⁴ 


3 The earlier rule states that the marginal profitability must be the same in both uses, 
whereas now we have the marginal revenue of the input the same in the production of 
either x or y. But if a unit of resources costs D dollars, the marginal profit of i in the 
production of x ($MP_{ix}$) equals its marginal revenue minus its cost, so that if marginal 
profitability is the same in both uses we have 

$$MP_{ix} = MR_{ix} - D = MR_{iy} - D = MP_{iy},$$ 

so that we must also have $MR_{ix} = MR_{iy}$, and conversely. 

In this discussion we have considered only the output decisions of a 
profit-maximizing firm. Of course, the firm has other decisions to make. 
In particular, it must decide on the amounts of its inputs including its 
marketing inputs (advertising, sales force, etc.). There are similar rules for 
these decisions, as discussed in Chapter 11 and in Chapter 17, Section 6. 
The main result here is that profit maximization requires for any inputs i 
and j 

$$MP_i/P_i = MP_j/P_j,$$ 


where $MP_i$ represents the marginal profit contribution of input i and $P_i$ 
is its price, etc. 
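
This input rule can be recovered from a Lagrange multiplier argument of the kind the text refers to. The sketch below rests on a setup that is assumed here rather than taken from the source: the firm chooses quantities $v_i$ and $v_j$ of two inputs to maximize a profit contribution $G(v_i, v_j)$ subject to a fixed outlay $B$, with $MP_i = \partial G/\partial v_i$. The symbols $G$, $v_i$, $v_j$, $B$, and $\lambda$ are introduced only for illustration.

```latex
\begin{aligned}
\mathcal{L} &= G(v_i, v_j) + \lambda\,(B - P_i v_i - P_j v_j),\\[4pt]
\frac{\partial \mathcal{L}}{\partial v_i} &= MP_i - \lambda P_i = 0,
\qquad
\frac{\partial \mathcal{L}}{\partial v_j} = MP_j - \lambda P_j = 0,\\[4pt]
\Longrightarrow\quad \frac{MP_i}{P_i} &= \frac{MP_j}{P_j} = \lambda .
\end{aligned}
```

Under these assumptions the multiplier $\lambda$ is the profit contribution of the last dollar spent on either input, which is why the ratios must be equal.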

Having discussed the consequences of profit maximization, let us see 
now what difference it makes if the firm adopts an alternative objective, 
one to which we have already alluded—the maximization of the value of 
its sales (total revenue) under the requirement that the firm’s profits not 
fall short of some given minimum level. 


5. Price-Output Determination: Sales Maximization 


Sales maximization under a profit constraint does not mean an attempt 
to obtain the largest possible physical volume (which is hardly easy to 
define in the modern multiproduct firm). Rather, it refers to maximization 
of total revenue (dollar sales), which, to the businessman, is the obvious 
measure of the amount he has sold. Maximum sales in this sense need not 
require very large physical outputs. To take an extreme case, at a zero price 
physical volume may be high but dollar sales volume will be zero. There 
will normally be a well-determined output level which maximizes dollar 
sales. This level can ordinarily be fixed with the aid of the well-known 
rule that maximum revenue will be obtained only at an output at which 
the elasticity of demand is unity, i.e., at which marginal revenue is zero. 
This is the condition which replaces the "marginal cost equals marginal 
revenue" profit-maximizing rule. 


4 To see how this follows from the preceding version of our rule, suppose that $1 
in inputs produces K dollars worth of x and K' dollars worth of y, where the preceding 
rule requires K' = K. Then if one unit of x requires, say, $5 in inputs (marginal cost $5), 
one unit of x must be worth (approximately) 5K dollars. Similarly, if it costs $9 to 
produce a unit of y, that unit must be worth 9K dollars. Hence we must have 

$$MC_x/P_x = 5/5K = 9/9K = MC_y/P_y.$$ 


All of these rules can also be derived with the aid of a Lagrange multiplier analysis, as 
shown in the last sections of Chapters 9 and 11. The reader can supply the proofs as 
an exercise. 

If all other costs are added to advertising cost, we get the line which 
depicts the firm’s total (production, distribution, and selling) costs as a 
function of advertising outlay. Subtracting these total costs from the level 
of dollar sales at each level of advertising outlay, we obtain a total profits 
curve, PP’. 

We see that the profit-maximizing expenditure is OA,, at which PP’ 
attains its maximum, M. If, on the other hand, the sales maximizer's 
minimum acceptable profit level is OP,, the constrained sales-maximizing 
advertising budget level is OA.. It is to be noted that there is no possibility 
of an unconstrained sales maximum which is analogous to output OQ, in 
Figure 2. For, by assumption, unlike a price reduction, increased advertising 
always increases total revenue. As a result, it will always pay the sales 
maximizer to increase his advertising outlay until he is stopped by the 
profit constraint—until profits have been reduced to the minimum ac- 
ceptable level. This means that sales maximizers will normally advertise 
no less than, and usually more than, do profit maximizers. For unless the 
maximum profit level A,/ is no greater than the required minimum OP,, 
it will be possible to increase advertising somewhat beyond the profit- 
maximizing level OA, without violating the profit constraint. Moreover, 
this increase will be desired since, by assumption, it will increase physical 
sales, and with them, dollar sales will rise proportionately. 

The interrelationship between output and advertising decisions now 
permits us to see the reason for the earlier assertion that an unconstrained 
sales-maximizing output OQ, (Figure 2) will ordinarily not occur. For if 
price is set at a level which yields such an output, profits will be above 
their minimum level and it will pay to increase sales by raising expenditure 
on advertising, service, or product specifications. This is an immediate 
implication of the theorem that there will ordinarily be no unconstrained 
sales-maximizing advertising level. Since its marginal revenue is always 
positive, advertising can always be used to increase sales up to a point 
where profits are driven to their minimum level. 


7. Choice of Input and Output Combinations 


The typical firm is a multiproduct enterprise (frequently the number of 
distinct items runs easily into the hundreds or even thousands) and, of 
course, it employs a large variety of inputs. This section examines briefly 
the effect of sales (rather than profit) maximization on the amounts and 
allocation of the firm’s various inputs and outputs. 

We obtain the following result, which may at first appear rather surprising: 
Given the level of expenditure, the sales-maximizing firm will 
produce the same quantity of each output, and market it in the same ways, 
as does the profit-maximizer. Similarly, given the level of their total 
revenues, the two types of firm will optimally use the same inputs in identical 
quantities and will allocate them in exactly the same way. This result may 
be somewhat implausible because one is tempted to think of some products 
or some markets as higher-profit, lower-revenue producers than others and 
one would expect the profit-maximizing firm to concentrate more on the 
one variety and the sales-maximizing firm to specialize more in the other. 
But we shall see in a moment why this is not so. 

It is easy to illustrate our result geometrically. In Figure 4 let x and y 
represent the quantities sold of two different products (or sales of one 
product in two different markets) or the quantities bought of two different 
inputs. The curves labeled R1, R2, etc., are iso-revenue curves, i.e., any 
such curve is the locus of all combinations of x and y yielding some 
fixed amount of revenue. Similarly, CC’ represents all combinations of x 
and y which can be produced with a fixed outlay (total cost). The standard 
analysis tells us that the point of tangency, T, between CC’ and one of 
the R curves, is the point of profit maximization. But it is also the point 
of revenue maximization because it lies on the highest revenue curve 
attainable with this outlay. This demonstrates our result. 

A little reflection should now render the result quite plausible. The point 
is simply that, given the level of costs, since profit equals revenue minus costs, 
whatever maximizes profits must maximize revenues. Hence, differences 
between the profit and the sales-maximizer's output composition or resource 
allocation must be attributed not to a reallocation of a given level of costs 
(or revenues) but to the larger outputs (and hence total costs and revenues) 
which, we have seen, are to be expected to accompany sales maximization.⁷ 
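
The equivalence is easy to confirm with a small numerical sketch. Everything specific below is an assumption made for illustration (the linear demand curves, the constant unit costs of 2 and 3, and the fixed outlay of 60); only the logic, that profit is revenue minus a fixed cost along the outlay line, comes from the text.

```python
# Hypothetical demand curves and unit costs, chosen only for illustration.
OUTLAY = 60.0               # fixed total cost that defines the line CC'
COST_X, COST_Y = 2.0, 3.0   # assumed constant unit costs of x and y


def revenue(x, y):
    """Total revenue with assumed linear demand for each product."""
    price_x = 20.0 - 0.5 * x
    price_y = 30.0 - 1.0 * y
    return price_x * x + price_y * y


def profit(x, y):
    """Revenue minus the cost of producing the combination (x, y)."""
    return revenue(x, y) - (COST_X * x + COST_Y * y)


def points_on_outlay_line(steps=3000):
    """Output combinations that use up exactly the fixed outlay."""
    for k in range(steps + 1):
        x = (OUTLAY / COST_X) * k / steps
        y = (OUTLAY - COST_X * x) / COST_Y
        yield x, y


best_for_revenue = max(points_on_outlay_line(), key=lambda p: revenue(*p))
best_for_profit = max(points_on_outlay_line(), key=lambda p: profit(*p))
print(best_for_revenue, best_for_profit)
```

Both searches select the same combination, because along the outlay line the two objectives differ only by the constant 60.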

Explained in this way, our theorem is completely trivial. But when the 
sales-maximizer's profit constraint is taken into account a more interesting 
but closely related conclusion can be drawn. 


7 We conclude that when the operations-researcher encounters the problem of 
allocating optimally some fixed quantity of a firm's resources, the values of all other decision 
variables being given, his answer will be exactly the same whether he is dealing with a 
sales- or a profit-maximizing firm. Such analytically derived equivalences can clearly 
permit significant economies in research. In this case, for example, it means that the 
operations-researcher may be able, when dealing with allocation problems, to avoid 
wasting effort in determining the order in which the company ranks sales and profit 
objectives. 

We may view the difference between maximum attainable profits and 
the minimum profit level expected by the sales-maximizer as a fund of 
sacrificeable profits which is to be devoted to increasing revenues as much 
as possible. Since each output is produced beyond the point of maximum 
profits, its marginal profit yield will be negative. In other words, each time it 
increases the output of some product in order to increase its total revenue 
the firm must use up more of its fund of sacrificeable profits. This fund of 
sacrificeable profits must be allocated among the different outputs, markets, 
inputs, etc., in a way which maximizes total dollar sales. The usual reasoning 
indicates that this requires the marginal revenue yield of a dollar of profit 
sacrificed, e.g., by product x, to be the same as that obtained from a dollar 
of profit lost to any other product, y; i.e., we must have 


$$\frac{\text{marginal revenue product of } x}{\text{marginal profit yield of } x} = \frac{\text{marginal revenue product of } y}{\text{marginal profit yield of } y}.$$ 


This relationship indicates that, even in the sales-maximizing firm, rela- 
tively unprofitable inputs and outputs are to be avoided, whatever the 
level of outlay and total revenue. 
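
The ratio condition can also be obtained from a Lagrange multiplier argument. The following is a sketch under assumptions not spelled out in the text: total revenue $R(x, y)$ and total profit $\Pi(x, y)$ are differentiable functions of the two outputs, the profit constraint $\Pi \ge \Pi_{\min}$ holds with equality, and the symbols $\Pi_{\min}$ and $\lambda$ are introduced here for illustration.

```latex
\begin{aligned}
\mathcal{L} &= R(x, y) + \lambda\,\bigl(\Pi(x, y) - \Pi_{\min}\bigr),\\[4pt]
\frac{\partial R}{\partial x} + \lambda\,\frac{\partial \Pi}{\partial x} &= 0,
\qquad
\frac{\partial R}{\partial y} + \lambda\,\frac{\partial \Pi}{\partial y} = 0,\\[4pt]
\Longrightarrow\quad
\frac{\partial R/\partial x}{\partial \Pi/\partial x}
&= \frac{\partial R/\partial y}{\partial \Pi/\partial y} = -\lambda .
\end{aligned}
```

Since each marginal profit yield is negative beyond the profit-maximizing output while marginal revenue is positive, the common ratio is negative, and $\lambda > 0$ measures the revenue gained per dollar of profit sacrificed.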


8. Pricing and Changes in Fixed Costs and Taxes 


Students consistently find one of the most surprising conclusions of the 
theory of the firm to be the assertion that fixed costs do not matter to 
pricing and output decisions. This piece of received doctrine is certainly 
at variance with business practice, where an increase in fixed costs is 
usually the occasion for serious consideration of a price increase. It is easy 
to show, however, that this is precisely the sort of response one would 
expect of the firm which seeks to maximize sales and treats its profits as a 
constraint rather than as an ultimate objective. For if, in equilibrium, the 
firm always earns only enough to satisfy its profit constraint, then a rise in 
overhead cost must mean that earnings fall below the acceptable minimum. 
Outputs and/or advertising expenditures must then be reduced in order to 
make up the required profits. The purpose of any such decrease in produc- 
tion is, of course, to permit an increase in selling price. 

This is very easily restated in terms of Figure 5. An increase in overhead 
costs means, geometrically, a uniform downward shift in the total profit 
curve by the amount of the overhead expenses. Hence, if overheads rise 
by amount CD, output will fall from OQ. to OQ', for at OQ. profits will now 
be QR, which is less than the minimum acceptable level OP». By contrast, 
the change in overhead costs will leave the profit-maximizing output 
unchanged at OQ/. For the added costs reduce the height of the "profit hill" 
uniformly, but they do not change the location of its peak. This result also 
has implications for tax policy. It has sometimes been held that there is 
nothing a company can do to shift any part of the corporation income tax 
on to the consumer or its employees. The profit-maximizing firm can gain 
nothing by raising its prices or changing its outputs in response to a change 
in corporation tax rates, provided that these rates are so structured that 
the higher the firm’s earnings are before taxes the more it gets to keep after 
taxes. The argument is almost exactly the same as the fixed cost analysis. 


8 Cf. Chapter 4, Section 9. 


[Figure 5: total profit plotted against quantity] 


The corporation tax reduces the height of the total profit curve, but it 
moves the peak of the curve neither to the right nor to the left. 

But, once again, if the firm wishes to maximize sales subject to a profit 
requirement, rather than maximizing profits, this conclusion loses its 
validity. When taxes are raised, the firm will be motivated to increase its 
price (and, therefore, to reduce its output) in order to make up its lost 
profits. The explanation of the shiftability of this apparently unshiftable 
tax is simple—the sales-maximizing firm will, in effect, have a reserve of 
profits which it has not claimed (it has not maximized profit) but which it 
can fall back on when driven to do so by a rise in tax costs, though it can get 
back to its old profits only by some sacrifice in its sales. 
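
The contrast between the two objectives when overhead rises can be sketched numerically. The example borrows the demand and cost functions of the sample calculation in Section 10 (P = 20 − Q, C = Q² + 8Q + F, with the fixed cost F originally 2) and a minimum profit of 8; the rise in overhead from 2 to 6 and the function names are assumptions made for illustration.

```python
import math


def profit(q, fixed_cost):
    """Profit with demand P = 20 - Q and cost C = Q^2 + 8Q + fixed_cost."""
    return (20.0 - q) * q - (q ** 2 + 8.0 * q + fixed_cost)


def profit_max_output(fixed_cost, steps=10_000):
    """Grid search for the profit-maximizing output on [0, 10]."""
    grid = (10.0 * k / steps for k in range(steps + 1))
    return max(grid, key=lambda q: profit(q, fixed_cost))


def constrained_sales_max_output(fixed_cost, min_profit):
    """Largest output meeting the profit floor, capped at the revenue peak Q = 10.

    profit >= min_profit  <=>  2Q^2 - 12Q + (fixed_cost + min_profit) <= 0,
    and revenue (20 - Q)Q rises with Q up to 10, so the upper root is chosen.
    """
    disc = 144.0 - 8.0 * (fixed_cost + min_profit)
    if disc < 0:
        raise ValueError("the profit constraint cannot be met at any output")
    return min((12.0 + math.sqrt(disc)) / 4.0, 10.0)


for overhead in (2.0, 6.0):
    q_profit = profit_max_output(overhead)
    q_sales = constrained_sales_max_output(overhead, min_profit=8.0)
    print(f"overhead {overhead}: profit-max Q = {q_profit:.3f}, "
          f"sales-max Q = {q_sales:.3f}, sales-max price = {20.0 - q_sales:.3f}")
```

In this sketch the profit-maximizing output stays at 3 while the sales maximizer cuts output and raises price, which is the response to higher overhead described above.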

This concludes the discussion of the implications of a sales maximization 
objective. In the present context the analysis is important primarily as an 
illustration of the effects of alternative objectives on the optimal decisions 
of the firm. It is designed to indicate the seriousness of the errors which 
can arise unless care is exercised in investigating the goals of a company 
before undertaking an analysis of its behavior and its policies. 


9. Satisficing and Behavior Analysis 


Professor Simon has offered yet another persuasive hypothesis about 
the objectives of firms. He has argued that in many cases management 
recognizes implicitly or explicitly the complexity of the calculations and the 
imperfections of the data which must be employed in any optimality calcu- 
lation. As a result, firms frequently give up the attempt to maximize 
anything—profits or sales or anything else. Instead, they set up for them- 
selves some minimal standards of achievement which they hope will assure 
the firm’s viability and an acceptable level of profit. Firms which are satis- 
fied to achieve such limited objectives are said to “satisfice” instead of 
maximizing. Starting from this hypothesis, a number of investigators led by 
Cyert and March have attempted to develop what they call a behavioral 
theory of the firm—one which seeks to show how firms really act, not just 
how they ought to act if their decisions were all optimal. Using computers to 
simulate observed decision processes of a number of companies, they have 
achieved remarkable success in employing some of these programs to pre- 
dict company decisions. Though one may question whether they have 
provided a theory or an empirical approach and evidence for the construc- 
tion of a theory, the significance of the entire analysis is undeniable. Cer- 
tainly we can no longer operate comfortably on the assumption that profit 
maximization adequately explains all of the observed business behavior. 


10. Profit and Sales Maximization: Sample Calculations 


Example: Given the demand function $P = 20 - Q$ and the total cost function 
$C = Q^2 + 8Q + 2$. 


(a) What output, $Q_m$, maximizes total profit and what are the corresponding 
values of price, $P_m$, profit, $\Pi_m$, and total revenue (sales), $R_m$? 

(b) What output, $Q_s$, maximizes sales and what are the corresponding values 
of price, $P_s$, profit, $\Pi_s$, and total revenue, $R_s$? 

(c) What output, $Q_c$, maximizes sales subject to the constraint $\Pi \ge 8$, and 
what are the corresponding values of the other variables, $P_c$, $\Pi_c$, and $R_c$? 


Answer to a: 

$$\text{total profit} = \Pi = PQ - C = -Q^2 + 20Q - Q^2 - 8Q - 2 = -2Q^2 + 12Q - 2.$$ 

Therefore, to maximize profit, we require 

$$\frac{d\Pi}{dQ} = -4Q + 12 = 0 \quad\text{or}\quad Q_m = 3,$$ 

and so 

$$P_m = 20 - Q_m = 17,$$ 
$$\Pi_m = -2Q_m^2 + 12Q_m - 2 = -18 + 36 - 2 = 16,$$ 
$$R_m = Q_m P_m = 3 \cdot 17 = 51.$$ 
Answer to b: 

$$\text{total revenue} = R = PQ = -Q^2 + 20Q.$$ 

Therefore, to maximize sales, R, we require 

$$\frac{dR}{dQ} = -2Q + 20 = 0 \quad\text{or}\quad Q_s = 10,$$ 

and direct substitution yields 

$$P_s = 10, \quad \Pi_s = -82, \quad R_s = 100.$$ 


Answer to c: 
Set $\Pi = 8$ so that 

$$8 = -2Q^2 + 12Q - 2, \quad\text{i.e.,}\quad 2Q^2 - 12Q + 10 = 0.$$ 

Solving for Q we obtain the two roots $Q = 1$ and $Q = 5$. Since (by direct 
substitution) the corresponding values of P and R are seen to be $P(1) = 19$, 
$R(1) = P(1)Q(1) = 19$, $P(5) = 15$, $R(5) = 75$, the constrained sales-maximizing 
value of Q is 

$$Q_c = 5.$$ 

The corresponding values of the other variables are 

$$P_c = 15, \quad \Pi_c = 8, \quad\text{and}\quad R_c = 75.$$ 
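
The three answers can be cross-checked by brute force. The following is a minimal sketch that simply evaluates the example's demand and cost functions on a grid of outputs; the grid spacing of 0.001 and the variable names are choices made here, not part of the example.

```python
def price(q):
    """Demand function of the example: P = 20 - Q."""
    return 20.0 - q


def revenue(q):
    """Total revenue R = PQ."""
    return price(q) * q


def cost(q):
    """Total cost function of the example: C = Q^2 + 8Q + 2."""
    return q ** 2 + 8.0 * q + 2.0


def profit(q):
    """Total profit: revenue minus cost."""
    return revenue(q) - cost(q)


# Candidate outputs from 0 to 20 in steps of 0.001.
GRID = [k / 1000 for k in range(0, 20001)]

q_m = max(GRID, key=profit)                                    # (a) profit maximum
q_s = max(GRID, key=revenue)                                   # (b) sales maximum
q_c = max((q for q in GRID if profit(q) >= 8.0), key=revenue)  # (c) constrained

for label, q in (("a", q_m), ("b", q_s), ("c", q_c)):
    print(label, q, price(q), profit(q), revenue(q))
```

The search reproduces the outputs 3, 10, and 5 found analytically, together with the corresponding prices, profits, and revenues.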


PROBLEMS 


In Problems 1 and 2, below, given the demand and total cost functions specified, 
determine the optimal output, Q, price, P, total profit, $\Pi$, and total revenue, R, 

(a) under profit maximization, 
(b) under unconstrained sales maximization, 
(c) under sales maximization subject to the specified constraint. 


1. $P = 12 - 0.4Q$, $C = 0.6Q^2 + 4Q + 5$, constraint $\Pi_c \ge 10$. 

2. $P = 16 - Q + 24/Q$, $C = 43 + 4Q$, constraint $\Pi_c \ge 16$. 
3. Let E be the price elasticity of demand and S be the price elasticity of sales, i.e., 
$E = -\frac{dQ}{dP}\cdot\frac{P}{Q}$ and $S = \frac{dR}{dP}\cdot\frac{P}{R}$, where $R = PQ$. 
Prove that $S = 1 - E$. Show, therefore, that if demand is inelastic ($E < 1$), 
a fall in price will reduce total revenue ($S > 0$), etc. (elasticity Theorem II, 
Section 4, Chapter 9). 


4. Given marginal revenue $(MR) = \frac{d(PQ)}{dQ}$, prove that $MR = P\left(1 - \frac{1}{E}\right)$. 


5. Use the two preceding results to show that 


(a) If price elasticity of demand equals unity, then total revenue is not affected 
by the value of Q. 
(b) If $P \ne 0$, then marginal revenue equals zero if and only if $E = 1$. 


(c) The ratio between price and marginal revenue is constant if and only if E 
is a constant. 


Chapter 16. Market Structure, Pricing, and Output 


The determination of prices and output levels is very much affected 
by the competitive structure of the market. Here, "competitive structure" 
is a phrase which refers to the nature and extent of the monopolistic ele- 
ments, if any, that are present in any particular market situation. There 
exists a large body of literature which discusses various types of competitive 
conditions running the range from perfect competition to pure monopoly, 
and which seeks to analyze their effects on prices and output. It is con- 
venient to begin our discussion with a listing of some of the market cate- 
gories which have been investigated. 


1. Classification of Market Structures 


The economist has classified industries or groups of firms into several 
categories, depending on the nature of competitive conditions. The fol- 
lowing are fairly standard definitions: 

1. Pure competition: An industry is said to be operating under conditions 
of pure competition when the following requirements are met: 

(a) Many firms. There must be a large number of firms in the industry, 
each of which controls so small a proportion of total output 
that its addition to or removal from the market has little or no effect 
on the market price; 

(b) Homogeneity of products. All firms must be known by buyers to 
produce identical products (cf. the definition of monopolistic competition 
below); 

(c) Freedom of entry and exit. Any individual or company with the 
funds and inclination must be able to enter (start or buy a firm in) 
the industry without artificial hindrances being erected against him, 
and any owner of a firm in the industry who can find a buyer may freely 
sell his company; 

(d) Independent decision-making. There must be no collusion. 


2. Pure monopoly: A firm is classed as a pure monopolist if it is the sole 
producer of some commodity for which there are no close substitutes and 
if it faces no imminent threat of competitors. 

3. Monopolistic competition with product differentiation: This is a market 
arrangement very similar to pure competition except for feature 1(b)— 
product standardization. Under product differentiation each firm produces 
goods which are different or which customers believe to be different from 
competitive products. The “product differences" may in fact not involve 
characteristics of the products themselves. More attractive wrapping, 
more convenient location, or special sales features such as better service or 
free gift coupons may be the basis for customer preferences and loyalties. 

4. Monopsony: A buyer's monopoly. 

5. Discriminating monopoly: A firm which charges different prices to 
different customers for the same commodity. 

6. Bilateral monopoly: A single purchaser without competitors buying 
from a monopolist seller. 

7. Duopoly: A two-firm industry. This is a special case of 


8. Oligopoly: An industry with a small number of large firms producing 
the bulk of its output. 


Pure monopolies have always been rare if they ever existed at all. Pure 
competition also is rare, although there exist a number of commodities 
which, for many purposes, provide a good approximation: Grains and the 
stock market are two outstanding examples. 

Illustrations of the other market forms are readily found. For the de- 
fense industries the government is a monopsonistic buyer, and wage 
negotiation between unions and industry representatives sometimes closely 
resembles bilateral monopoly. Doctors are well-known price discriminators. 

However, the bulk of our enterprises seems to fall into the two remaining 
classifications, monopolistic competition and oligopoly. Competing 
neighborhood retailers of all sorts are typically monopolistic competitors, 
each with his corps of more or less loyal customers and locational and 
personality differences which make his products and services at least somewhat 
distinct from those of his competitors. The lion’s share of manufacturing 
is in the hands of oligopoly firms (steel, autos, tobacco, almost 
any present-day large industry), and it is in these industries that most 
privately sponsored operations-research work occurs. 

Let us now discuss the price and output determination process in some 
of these market situations. The tools which have been described in earlier 
chapters will permit us to deal with these cases fairly briefly. 


2. The Profit-Maximizing Competitive Firm 


The bulk of the analytic literature on price-output determination has 
traditionally been devoted to the case of pure competition. The reason is 
at least partly that such a case is more readily amenable to analysis so 
that it is possible to develop a far richer theoretical structure for this 
situation than for other market forms. 

First, let us examine some immediate consequences of the definition of 
pure competition. Under pure competition the demand curve of the firm 
is always horizontal (perfectly elastic). This follows from the first feature 
of the definition of pure competition—the relative insignificance of each 
firm so that no one of them can affect price noticeably. The single wheat 
farmer can do nothing about the day’s price in Chicago. If he raises the 
price of his wheat above the going price, he will be unable to sell anything, 
whereas he can gain nothing by cutting his price below the market price 
for at the prevailing price he can sell any amount he can be expected to 
produce. For him, then, there is no price decision to be made—the price 
figure is simply handed to him. 

In the short run the firm may, of course, end up making either profit or 
loss. But in the long run the free entry and exit feature of pure competition 
assures us that these profits or losses will disappear altogether! If the 
industry is profitable, new firms will be induced to enter it and compete 
with the already established concerns. The resulting increase in demand for 
inputs may bid up their prices and hence raise costs. Certainly the in- 
creased product supply can be expected to reduce its market price. Thus, 
profits will tend to be squeezed down toward zero or at least until no 
additional firms find it worth moving in. 

Similarly, if there is initially a net loss to firms in the industry, the exit 
of concerns will raise profits and ultimately it will eliminate the loss. 

Of course, this conclusion holds only if there are no autonomous changes 
in demands or costs during the period of adjustment. A foreign crop failure 
or the invention of more efficient equipment may suddenly restore high 
profits to wheat farming and so offset the influence of new entrants. Since, 
to some extent, such changes are always taking place, the adjustment 
toward zero profits will always be imperfect. However, the forces working 
in that direction will nevertheless be there. 

Let us now examine in somewhat greater detail the nature of this com- 
petitive equilibrium toward which the market tends to adjust. Such a 
situation is depicted in Figures 1a and 1b. In these diagrams the horizontal 
line DD’ is the firm's demand curve. The curve is horizontal because, as 
already stated, no change in the firm's output is a sufficiently significant 
contribution to total market supply to affect the price. 

So long as there is no price discrimination, any firm's demand curve 
will also be its average revenue curve. The reason is that if all units of a 
commodity are sold at the same price, the revenue brought in by an average 
unit must be its price. Hence DD’ is also the average revenue curve. We know 
that where an average curve is horizontal it coincides with the corresponding 
marginal curve. Here, since the average revenue curve is horizontal throughout 
its length, it must everywhere coincide with the marginal revenue curve 
(Chapter 3, Section 5). 


[Figure 1: curve labels include MC and DD’ (AR = MR)] 


Figure 1a represents the short-run situation of a competitive profit-maximizing 
firm. Its profit-maximizing output is OQ,, where the marginal cost curve 
intersects the marginal revenue (demand) curve DD’.¹ At that point it is earning 
a profit, for on each unit it produces it obtains VW (= unit revenue minus unit 
cost). Thus its total profit (= unit profit times the number of units produced) is 
represented by the area of a rectangle of height VW and base equal to that output. 

Figure 1b represents a long-run equilibrium situation. Here the average 
cost curve must be tangent to the demand curve, DD’. The reason is that 
if the unit costs were everywhere higher than price, every output would be 
unprofitable, and firms would leave the industry, thus shifting the curves 
toward tangency by raising DD’ (price) and, possibly, lowering the cost 
curves as well. Similarly, if the average cost curve were to intersect the 
demand curve, there would be some outputs at which profits could be 
earned and an influx of new firms would soon shift the cost and revenue 
curves sufficiently to wipe out these profits. Only when there is tangency 
will the “no-profit, no-loss" position of long-run equilibrium be the best 
the firm can do, and no firms will be tempted to enter or leave the field if 
this is the typical situation of all firms in the area. 

Since the demand curve is horizontal, the point of tangency, T, of the 
average cost curve with the demand curve must occur at an output at which 
the average cost curve is also horizontal, i.e., it will be at an output where 
unit costs are at a minimum. For that reason the marginal cost curve will 
also intersect the average cost curve at that point. In sum, at the point of 
equilibrium we have the impressive set of equalities, marginal cost equals 
marginal revenue equals average cost equals average revenue equals price 
(Figure 1b). 
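
The chain of equalities can be illustrated with a numerical sketch. The cubic total cost function below is purely hypothetical (it is not taken from the text); it is chosen so that average cost is U-shaped. The code locates the output of minimum average cost and confirms that marginal cost equals average cost there, so a price at that level satisfies all of the equalities at once.

```python
def total_cost(q):
    """Hypothetical long-run total cost function (an assumption for illustration)."""
    return q ** 3 - 6.0 * q ** 2 + 15.0 * q


def average_cost(q):
    """Unit cost at output q (q must be positive)."""
    return total_cost(q) / q


def marginal_cost(q, h=1e-6):
    """Central-difference estimate of the derivative of total cost."""
    return (total_cost(q + h) - total_cost(q - h)) / (2 * h)


# Search positive outputs up to 10 in steps of 0.001 for the lowest unit cost.
outputs = [k / 1000 for k in range(1, 10001)]
q_star = min(outputs, key=average_cost)

# In long-run equilibrium the horizontal demand curve is tangent to AC here.
equilibrium_price = average_cost(q_star)

print(f"output {q_star}, price = AC = {equilibrium_price:.4f}, "
      f"MC = {marginal_cost(q_star):.4f}")
```

With this cost function the minimum of average cost occurs at an output of 3, where price, average cost, and marginal cost all equal 6 and profit is zero.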

It is to be noted that in equilibrium every firm in the industry must 
have the same costs, for the product price will be the same for all such 
companies, and both marginal and average costs will equal price for all 
firms. This may appear to smack of the miraculous. Firms with dissimilar 
resources, production techniques, and operating procedures all end up with 
the same costs. But this is another work of the competitive mechanism. 
More efficient firms must have lower costs because some of their resources 
are better—items such as more convenient location, purer raw materials, 
or more skilled managers must account for the difference. But competition 
guarantees that if manager A can run a firm at $10,000 more cheaply per 
year than can B, then A's salary will tend to be bid up until it is $10,000 
per annum higher than B's, for if A's firm pays him only $8,000 more, it 
will be in B's firm's interests to try to bid A away with an offer of $9,000 
and it will pay A's firm to hold him with a $9,500 counteroffer, etc. (Of 
course, if A is the owner of the firm he will simply gather these wages of 
his special skills in the form of profit in the noneconomist's sense of the 
word.) In this way all cost savings will tend to be paid out to the more 
efficient inputs that make them possible. Hence, since we must include 
these bonus payments, the costs of the more efficient firms will tend to be 
driven toward equality with those of the less efficient. 

One more point needs to be made here: the zero-profit rule may seem 
implausible at first glance. Why should anyone stay in business if it yields 
him no returns? But the term “profits” is used here in a rather strict sense. 

2. The smaller the slope of the supply curve, the greater will be the 
proportion of the tax which is shifted to the consumer. 


These rules have a ready intuitive justification. For example, a steep 
demand means that buyers are willing to continue to purchase pretty much 
the same quantity almost no matter what the price. It is natural to expect 
that such anxious purchasers will end up paying the bulk of the tax. 
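
For straight-line curves the intuition can be checked with a little arithmetic. The sketch below is a minimal illustration under assumptions that are not in the text: demand is taken to be P = a - bQ, supply P = c + dQ, the tax is a fixed amount per unit collected from sellers, and all function and parameter names are hypothetical.

```python
def consumer_share_of_tax(demand_slope, supply_slope):
    """Fraction of a per-unit tax that appears as a higher price to buyers.

    Assumes linear demand P = a - demand_slope * Q and linear supply
    P = c + supply_slope * Q, with both slopes given as positive numbers.
    """
    return demand_slope / (demand_slope + supply_slope)


def price_paid_by_buyers(a, demand_slope, c, supply_slope, tax=0.0):
    """Equilibrium price to buyers when sellers remit `tax` per unit sold."""
    quantity = (a - c - tax) / (demand_slope + supply_slope)
    return a - demand_slope * quantity


before = price_paid_by_buyers(a=20.0, demand_slope=2.0, c=2.0, supply_slope=1.0)
after = price_paid_by_buyers(a=20.0, demand_slope=2.0, c=2.0, supply_slope=1.0, tax=3.0)
print(before, after, consumer_share_of_tax(2.0, 1.0))
```

With these assumed figures the price to buyers rises by two of the three dollars of tax. Making the demand curve steeper, or the supply curve flatter, raises the buyers' share, as the two rules state.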


4. Supply Curves: Some Comments 


A supply curve is, of course, defined as a graph which shows what 
quantities of a commodity will be offered for sale at different prices—i.e., 
it summarizes the seller's quantity reaction to various prices. We have just
seen that the long-run supply curve for the industry will coincide with its 
average cost curve. 

The competitive firm, too, will have a supply curve, and it will also be 
related to costs, but in a quite different manner. In fact, the profit-maxi- 
mizing firm’s supply curve will coincide with (a portion of) its marginal 
cost curve. This proposition can readily be demonstrated with the aid of 
Figure 1a. As we have seen, our firm will find it profitable to produce up to a
point where price (marginal revenue) equals marginal cost. Thus, at price 
OD, the firm's supply will be OQ,, and, similarly, at price OU it will supply 
quantity OQ.. Both of these price-quantity supplied combinations are 
represented by points (W and T) on the marginal cost curve, CMC. Since a 
similar observation holds for any other price at which the firm is willing to 
produce, our result follows—the firm’s supply curve is the same as its 
marginal cost curve.²
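
A small numerical sketch may make the proposition concrete. The cost function below is purely hypothetical (total variable cost Q³/3 - 2Q² + 6Q, chosen so that the arithmetic is easy), and the function names are illustrative only. The firm's supply at any price is read off the rising part of the marginal cost curve, and nothing is supplied at a price below minimum average variable cost.

```python
import math


def marginal_cost(q):
    """MC for the assumed total variable cost q**3/3 - 2*q**2 + 6*q."""
    return q * q - 4.0 * q + 6.0


def average_variable_cost(q):
    """AVC for the same assumed cost function."""
    return q * q / 3.0 - 2.0 * q + 6.0


# For this cost function AVC reaches its minimum at q = 3, where it equals MC.
MIN_AVC_OUTPUT = 3.0


def quantity_supplied(price):
    """Output at which price equals MC on the rising segment of the MC curve."""
    shutdown_price = average_variable_cost(MIN_AVC_OUTPUT)
    if price < shutdown_price:
        return 0.0  # the firm is better off supplying nothing
    # Larger root of q**2 - 4q + 6 = price, i.e. the point on rising MC.
    return 2.0 + math.sqrt(price - 2.0)


for price in (2.5, 3.0, 6.0):
    print(price, quantity_supplied(price))
```

At a price of 2.5 marginal cost can be matched at two outputs, but both lie below average variable cost, so the sketch returns zero.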

The supply curve is, strictly speaking, a concept which is usually rele- 
vant only for the case of pure (or perfect) competition, and it will therefore 
not be encountered in later sections of this chapter. The reason for this lies 
in its definition—the supply curve is designed to answer questions of the 
form, “How much will firm A supply if it encounters a price which is fixed 
at P dollars?” But such a question is most relevant to the behavior of 
firms that actually deal with prices over whose determination they exercise 
no influence. Only in two situations may we expect firms to encounter such 
preset prices—if there is a central authority who sets prices by fiat, and in 
conditions of pure competition where the price is set by an impersonal
market mechanism outside the control of any buyer or seller. In most other
circumstances the firm will be able to set its own price, so the information
given by the supply curve will be inapplicable to the operations of such a
company.


2 Strictly speaking, the supply curve includes only the rising segment of the firm’s
marginal cost curve. At price OD the firm will produce OQ; and not OQ,, as has already
been noted in footnote 1, above. Therefore, a point such as L, on the descending portion
of the marginal cost curve, will form no part of the supply curve. Moreover, even points
on the ascending portion of the marginal cost curve must be excluded from the supply
curve if they lie below the average variable cost curve. At any price which lies below average
variable cost the firm is better off supplying nothing.


5. Pure Monopoly 


In the case of pure monopoly the firm and the industry coincide by
definition—the monopoly is the industry. The output of the monopolistic
firm must therefore be compared with that of the industry under pure
competition. This comparison can be made with the help of a diagram which
combines the supply-demand analysis of the competitive industry with the
marginal apparatus of the theory of the firm. This is done in Figure 4.

Here DD' and SS' are the competitive industry demand and supply curves,
respectively. Suppose, now, that a monopolist takes over the competitive
industry, and that in the process there occurs no change in the basic
conditions of demand and cost (including rents). The demand curve now
becomes the monopolist’s average revenue curve. Moreover, as we have
seen, the long-run supply curve tends to approximate the average cost
curve.³

We can, therefore, construct the monopolist’s marginal cost and mar- 
ginal revenue curves, SMC and DMR, from this information by the 
methods of Chapter 3, Section 5. The monopolist's profit optimum output 
will, then, be OQm, which is clearly smaller than the competitive output
OQc. So long as the slope of the supply curve is positive, or, if it is nega- 
tive, so long as it is less steep than the demand curve, this will always be 
the case. For if the average cost curve cuts the average revenue curve 
from below, all outputs, such as OQ;, above the competitive zero-profit 
point, OQc, will cause the firm to lose money, since average cost, QzC, will
there exceed price, Q;P. Only to the left of the competitive equilibrium
point will there be any profits.


3 This assumes that the industry pays rent to suppliers of inputs because other
industries compete for their use. If, in the competitive situation, some factor rents had
been paid only because firms within the industry were bidding against one another for
the inputs, then with monopolization these rent payments would disappear and the
monopoly average cost curve would lie below the competitive supply curve.
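
The comparison can be worked through for straight-line curves. The sketch below is an illustration under assumptions that are not in the text: average revenue (demand) is taken to be a - bQ and average cost (the competitive long-run supply curve) c + dQ, and the function names are hypothetical. Total cost is then cQ + dQ², so marginal cost is c + 2dQ, and marginal revenue is a - 2bQ.

```python
def competitive_output(a, b, c, d):
    """Zero-profit output: average revenue a - b*Q equals average cost c + d*Q."""
    return (a - c) / (b + d)


def monopoly_output(a, b, c, d):
    """Profit-maximizing output: MR = a - 2*b*Q equals MC = c + 2*d*Q."""
    return (a - c) / (2.0 * (b + d))


def profit(q, a, b, c, d):
    """Total profit at output q: (average revenue - average cost) * q."""
    return ((a - b * q) - (c + d * q)) * q


params = dict(a=100.0, b=1.0, c=20.0, d=1.0)
q_c = competitive_output(**params)
q_m = monopoly_output(**params)
print(q_c, q_m, profit(q_m, **params), profit(q_c, **params), profit(q_c + 10, **params))
```

Both formulas require b + d > 0, which for a negatively sloped supply curve (d < 0) is the condition that it be less steep than the demand curve. In this linear case the monopoly output is exactly half the competitive output, and any output beyond the competitive one yields a loss.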

This is the standard well-publicized result that the monopolist tends to 
restrict his output. But this result must be treated with caution for two 
reasons. 


1. It is likely that cost and demand conditions will change when a 
monopoly takes over a competitive industry. By centralizing purchasing 
and, perhaps, having one man replace 100 independent buyers, by effecting 
economies of large scale through the combining of plants, inventories, etc., 
the monopolist may be able to reduce his costs. On the other hand, the 
larger monopolistic firm may require a more cumbersome, more costly 
administrative machinery. In addition, monopolistic advertising may 
increase the demand for the monopolist’s products. It follows that the
simple comparison of monopolistic and competitive outputs of Figure 4 
cannot be relied on. 

2. Even if the competitive industry does produce the larger output, it
is by no means obvious that this is always desirable from the point of view
of consumers or anyone else. In a period of full employment an increase in 
output in one industry takes away resources from elsewhere in the economy 
and forces a reduction in output and possibly a price rise there (the “guns 
vs. butter” problem). The basic point is that, under full employment, the 
determination of the outputs of the different industries is a matter of 
the allocation of resources. In popular discussions one tends to think 
that the larger the output of any industry, the better off is society, but 
it is easy to see that this can result in a misallocation of resources, just 
as, in the firm, a lopsided investment policy biased excessively toward one 
department may ease that department’s operations but is hardly likely to
be optimal from the point of view of the business as a whole. 


6. Monopolistic Competition (Product Differentiation) 


Chamberlin’s analysis of monopolistic competition deals largely with 
the individual firm and does not refer directly to the industry. In fact, it 
may be almost impossible to define an industry in such a situation. Dif- 
ferentiation of product means that no two firms put out the same item. 
Some products may, perhaps, be easily recognized as the same sort of item, 
but as one gets to less and less perfect substitute products one industry will 
tend to shade off into another. Hence, rather than well-defined industries, 
one tends to get something more like a continuum of products, although 
this assertion probably overstates the situation in practice. 

Under monopolistic competition the demand curve for the product of 
the firm may be expected to have a negative slope, even though the firm 
is as small as one operating under conditions of pure competition. For
customers will have different degrees of loyalty to the firms from whom
they make their purchases. A small reduction in one firm’s price may only 
attract its competitors’ most mercurial customers. But, as larger and 
larger price reductions are instituted, it may acquire more and more cus- 
tomers from its rivals by drawing on customers who are less anxious to 
switch. 

The equilibrium of the firm involves the usual conditions—marginal 
cost equal to marginal revenue. Again, in the short run, the firms may or 
may not earn a profit. But under monopolistic competition one can also 
expect something like freedom of entry. Since firms are small, relatively
little capital is required to set up business and turn out a product not
quite the same as but still very like those already on the market. 

The result is that, as under pure competition, both profits and losses 
will tend to be eliminated in the long run. The average cost curve will be 
driven toward tangency with the demand curve, and we will end up with 
a situation like that depicted in Figure 5. The equilibrium point, T, will 
be the point of tangency between the average cost curve and the nega- 
tively sloping demand curve, DD'. For at any other output, unit costs will 
be larger than price and so such an output will involve a loss to the firm.⁴
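
The tangency solution can be illustrated with simple assumed curves. In the sketch below, which is not taken from the text, average cost is F/Q + c (a fixed cost F plus a constant unit cost c) and the firm's demand curve is a straight line with slope -b whose height is whatever entry and exit leave it with; the function name and the numbers are hypothetical. The code solves for the output at which such a line is tangent to the average cost curve.

```python
import math


def tangency_equilibrium(fixed_cost, unit_cost, demand_slope):
    """Return (output, price, demand intercept) at the zero-profit tangency.

    Tangency needs equal heights, a - b*Q = F/Q + c, and equal slopes,
    -b = -F/Q**2, which gives Q = sqrt(F / b).
    """
    q = math.sqrt(fixed_cost / demand_slope)
    price = fixed_cost / q + unit_cost
    intercept = price + demand_slope * q
    return q, price, intercept


F, c, b = 100.0, 10.0, 1.0
q, price, a = tangency_equilibrium(F, c, b)
marginal_revenue = a - 2.0 * b * q  # MR of the linear demand curve at q
print(q, price, a, marginal_revenue)


def unit_profit(output):
    """Price minus average cost at a given output on the tangent demand curve."""
    return (a - b * output) - (F / output + c)
```

At the tangency output marginal revenue equals the constant marginal cost c, and at any other output on the tangent demand curve price falls short of average cost.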


4 It follows that, since OQ; is the maximum profit output, marginal cost must there
equal marginal revenue. This also follows from the standard relationship 


M = A + Q(dA/dQ),


price of his product because the costs of reimporting the product can be
prohibitive or because of other obstacles to reimportation.


We may also note several other properties of the discriminating mo- 
nopoly case: 


1. If he sets his prices properly, the discriminating monopolist can always
expect to earn at least as much as does an ordinary monopolist. For, in 
setting his prices independently in his different markets, he will always 
have the option of keeping the prices in several of his markets the same, 
if that is profitable. In other words, the discriminating monopolist can 
match every opportunity which is open to the ordinary monopolist and 
he has some others besides. 

2. The basic rule of profit maximization in discriminating monopoly 
is that marginal revenue must be the same in all markets to which the firm 
sells. For if its marginal revenue is greater in market A than in B, it can 
increase its profits by decreasing the amount shipped to B and transferring 
it to A. Only when this transfer process raises the price in market B and 
lowers it in A to a point where marginal revenues in the two markets are 
equal will the firm have arrived at an optimal allocation of its goods be- 
tween the two markets. The condition that marginal revenue must every- 
where be the same thus determines the discriminator's shipments to all 
his markets, and his total output is determined by setting the marginal 
revenue equal to marginal cost.⁵
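
The equal-marginal-revenue rule can be sketched for two markets with straight-line demand curves. Everything specific in the code below is an assumption made for illustration: demand in market i is taken to be P = a_i - b_i q_i, marginal cost is c + k(q1 + q2), both markets are assumed to be served, and the function name is hypothetical.

```python
def discriminating_monopoly(a1, b1, a2, b2, c, k=0.0):
    """Return (q1, q2, p1, p2) with MR equal in both markets and equal to MC.

    The two conditions a_i - 2*b_i*q_i = c + k*(q1 + q2) form a pair of
    linear equations in q1 and q2, solved here by Cramer's rule.
    """
    det = (2.0 * b1 + k) * (2.0 * b2 + k) - k * k
    q1 = ((a1 - c) * (2.0 * b2 + k) - k * (a2 - c)) / det
    q2 = ((2.0 * b1 + k) * (a2 - c) - k * (a1 - c)) / det
    return q1, q2, a1 - b1 * q1, a2 - b2 * q2


q1, q2, p1, p2 = discriminating_monopoly(a1=100.0, b1=1.0, a2=60.0, b2=0.5, c=20.0, k=1.0)
mr1 = 100.0 - 2.0 * 1.0 * q1
mr2 = 60.0 - 2.0 * 0.5 * q2
mc = 20.0 + 1.0 * (q1 + q2)
print(q1, q2, p1, p2, mr1, mr2, mc)
```

With these assumed figures the two prices differ (76 and 56) although marginal revenue is 52 in both markets and equals marginal cost at the combined output.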


5 The last section of this chapter describes the algebra of decision-making under price 
discrimination. Graphically the matter is handled by a horizontal summation of the 
marginal revenue curves for the company's submarkets. Thus in the diagram EF on the
total marginal revenue curve RM equals AB + CD, the sum of the corresponding quantities
on the marginal revenue curves of markets 1 and 2. The total quantity which will be
supplied is OQe where the total marginal revenue, RM, equals the total marginal cost,
MC (point F). This quantity is divided between the markets into amounts OQe1 and
OQe2 at which the marginal revenues are equal in both markets as required for profit
maximization. The prices in the two markets are given by the heights of points P1 and P2,
the points on the average revenue (demand) curves corresponding to these quantities.


9. Bilateral Monopoly


In this analysis it is convenient to extend the concept of the indifference 
curve and, in the present context, to be able to interpret it either as 
a curve of constant utility (as in the usual theory of the consumer) or as a 
curve of constant profitability so that any two points on such a curve are 
equally profitable. We deal here with the case of two people who are ex- 
changing two commodities. If one of the items exchanged is money, one 
of the bilateral monopolists (the money-payer) may be identified as the 
buyer and the other as the seller. Suppose that the buyer is an input-purchaser
and that the seller is an input-supplier. We may draw an indifference
map between money and X (the quantity of the input sold) for each
of these persons. In Figure 6a we have such an indifference map for the
input-purchaser. Here it is assumed that the buyer starts off with $500 in 
money. In interpreting the indifference map it should be noted that we 
read down from this figure to determine the amount the buyer pays out. 
For example, at point A he ends up with X, units of X and $350 of his 
original supply of money left. That means he must have spent the difference, 
$150, for the X, units of X.

Each such indifference curve is a locus of money-input purchase combinations
which are equally profitable. All points on the lower curve (such as
B) yield $200 in profit. Those on the next curve yield $300, etc. 

A similar diagram can be drawn to represent the circumstances of the
supplier, only in his case it will be the quantity of X which is measured from
his maximum supply capacity level downward, and any one of his profit
indifference curves must be the locus of all combinations of quantity supplied
and revenue which yield him a fixed level of profit (after deducting
the cost of production of that quantity of output from his total revenue at
that point).

The two indifference maps can now be combined in an ingenious rectangular
diagram⁶ (Figure 6b). One of the indifference maps is turned upside
down and the ends of the axes joined. Thus the buyer’s total money supply 
determines the length of the vertical axis, and the fixed input production 
capacity gives the length of the horizontal axis. Now, any point in the 
diagram may be interpreted as a trade. That is, it shows simultaneously 
where both of the bilateral monopolists will end up after an exchange. 
For example, point P represents a trade in which the buyer ends up with 
100 units of input and 300 units of money, while the seller ends up with 
the remainder—$200 (point M, on the right-hand axis) and 700 units of 
X in the form of unused production capacity—point X, on the top axis. 
(Note that the seller’s holdings are read downward and to the left from the 
upper right-hand corner, O', which is the origin of his upside-down indif- 
ference map.) Any point thus automatically indicates the ending position 
of both buyer and seller. Because of the fixed total of $500 in money, what- 
ever does not remain in the hands of the buyer must go to the seller and 
the same applies to X (or, rather, capacity to supply X). 

The solid indifference curves, B, are the buyer’s indifference curves 
whereas the broken S curves are those of the seller. 

Consider now the curve CC’, which is the locus of all points of tangency 
(such as T) between the buyer’s and seller’s indifference curves. CC’ is 
called the contract curve. It possesses two relevant features: 


1. For every trade point off the contract curve, there exist trade points 
on the contract curve which are mutually advantageous to buyer and seller. 
For example, consider point V, which is off the contract curve. Since it is 
not a point of tangency, the seller's and buyer's indifference curves S" and 
B which go through that point must intersect. Hence there will be a region 
between the two curves (shaded area) through which there passes higher 
profit indifference for both seller and buyer (e.g., indifference curve B’ 
yields more profit to the buyer than does B, and S’ is more profitable to the 
seller than S"). We conclude that all points on the arc of the contract curve 
TT"" which lies in the shaded region will be preferred to point V by both 
buyer and seller. 

2. Any move along the contract curve must be disadvantageous to one
of the participants. Any move downward and to the left must be disadvantageous
to the buyer (it gets him to a lower indifference curve), and
any move in the opposite direction must adversely affect the seller, for an
analogous reason.


6 This device, as well as the concept of the contract curve, described below, was
invented by Edgeworth. See F. Y. Edgeworth, Mathematical Psychics, Kegan Paul,
London, 1881, pp. 17ff. For a derivation of the equation of the contract curve see Chapter
21, footnote 11, below.


It has therefore been argued that the actual trading point must end 
up somewhere along the contract curve, CC’, for anywhere else it will be 
mutually advantageous to buyer and seller to renegotiate their deal, and 
only at a point on the contract curve will no such renegotiation be profit- 
able to both. 

The range of possible trading points can be narrowed down somewhat 
further. The point in the upper left-hand corner, N, represents the situation 
in which no exchange is made. The buyer holds on to his money and gets 
no X. Through this point there pass one of the buyer’s and one of the seller’s 
indifference curves (B and S'"). Neither buyer nor seller will be willing to 
accept any trade that leaves him on a lower profit indifference curve, 
for he can always refuse any such inferior proposition and stay at his 
“no-deal” point, N. This means that all possible trading points must lie 
in the region between the indifference curves through N, the shaded region 
in the diagram. For that reason, the only possible points on the contract 
curve are those on the arc TT”.

Beyond this it is difficult to narrow down any further the possible loca- 
tions of the final equilibrium point. Several suggestions have been offered— 
for example, the joint maximum point (the point which maximizes the sum 
of the profits of the buyer and seller together). However, it is difficult to
see why the bargainers should always be expected to end up at any one such
point. For that reason, many economists have concluded
that the bilateral monopoly problem is “indeterminate.”
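
The indeterminacy can be seen in a deliberately simple numerical case. The sketch below rests on assumptions that are not in the text: each party's profit is taken to be linear in money, the buyer's revenue from x units of input is 6x - 0.01x², the seller's production cost is 2x + 0.01x², and all names are hypothetical. With profits linear in money the contract curve is a vertical segment rather than the sloping curve CC’ of the figure, so the quantity traded is settled by the tangency condition while the payment is not.

```python
def input_value(x):
    """Hypothetical revenue the buyer earns from using x units of the input."""
    return 6.0 * x - 0.01 * x * x


def production_cost(x):
    """Hypothetical cost to the seller of producing x units of the input."""
    return 2.0 * x + 0.01 * x * x


def buyer_profit(x, payment):
    return input_value(x) - payment


def seller_profit(x, payment):
    return payment - production_cost(x)


def contract_quantity():
    """Tangency of the indifference curves: 6 - 0.02*x equals 2 + 0.02*x."""
    return (6.0 - 2.0) / (0.02 + 0.02)


def bargaining_range(x):
    """Payments leaving neither party below his no-deal profit of zero."""
    return production_cost(x), input_value(x)


x_star = contract_quantity()
low, high = bargaining_range(x_star)
print(x_star, low, high)
for payment in (350.0, 450.0):
    print(payment, buyer_profit(x_star, payment), seller_profit(x_star, payment))
```

Every payment between the two bounds lies on the contract curve and beats the no-deal point for both parties. The joint profit is the same at each of them, which is why that criterion alone cannot pick out a single trade in this assumed case.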

In fact, some have even suggested that the trade may well end up 
somewhere off the contract curve. If, for example, the trade happens to 
fall at point W, and the buyer feels that his bargaining position is so weak 
that a reopening of negotiation would move the trading point to somewhere
along T'T", he may prefer just to let sleeping dogs lie.

Similar indeterminacy problems will occur in the discussion of oligopoly
which follows, and some degree of explanation will be offered in the course
of the discussion.


10. Oligopolistic Interdependence 


The oligopoly situation (including in this term the two-firm duopoly 
case) has one feature on which most of the economist’s attention has been 
centered. This is the interdependence in the decision-making of the various 
firms, an interdependence which is recognized by all of them. In an industry 
which consists largely of a small number of sizable companies, if one of
them opens a tremendous advertising campaign or designs a new model of
his product which sweeps the market, he can be fairly sure that this will 
lead to countermoves on the part of his competitors. Every businessman 
in such a situation knows that at least some of his rivals’ decisions depend 
on his own behavior, and he must take this fact into account in his own 
decision-making. 

The reason for this interdependence in decision-making is, of course, 
that a major policy change on the part of one firm is likely to have obvious 
and immediate effects on the other companies which comprise the industry. 
As a result, the oligopolist has developed an armory of aggressive and 
defensive marketing weapons. For example, it is only under oligopoly that 
advertising comes fully into its own. Under pure competition no one has 
any motive to advertise because any producer can sell all of his product 
at the going price without incurring any advertising outlay. A monopolist 
will find some advertising to be profitable, perhaps when he is introducing
a totally new commodity or where there exists a considerable body of 
potential consumers who have never tried his type of ware. But under 
oligopoly, advertising can become a life-and-death matter where a firm
which fails to keep up with the advertising budget of its competitors may
find its customers drifting off to rival products. 

As a result, the oligopolistic businessman is sometimes rather surprised 
when the presence of competitive conditions in his industry is questioned. 
To him competition consists not in the quiescent stalemate of perfect competition
where there is no battle because there is never anyone strong
enough to disturb the peace. Rather, to him, true competition consists of 
the life of constant struggle, rival against rival, which one can only find 
under oligopoly (or, on a smaller scale, under conditions of monopolistic 
competition). 

Oligopolistic interdependence has another consequence which is of more 
importance for the economic literature than for the operation of the econ- 
omy. This feature of the situation has made the formulation of a systematic 
analysis of oligopoly very difficult. Under the circumstances a very wide 
variety of behavior patterns becomes possible. Rivals may decide to get
together and cooperate in the pursuit of their objectives, at least so far as
the law allows, or, at the other extreme, they may try to fight each other
to the death. Even if they enter into an agreement⁷ it may last or it may
break down. And the agreements may follow a wide variety of patterns. 

As a result, the literature of oligopoly theory is full of different models,
many of which describe, at most, one particular arrangement—a price-leadership
agreement or some particular method of using freight charges
as a means for apportioning out market territories.


7 In any event, oligopolistic collusion will always be somewhat limited by legal
restrictions and, in fact, by definition, for where all firms in an industry take all of their
decisions jointly, they are in essence amalgamated into a monopoly.

An even more serious analytical difficulty arises directly out of manage- 
ment’s need to take account of its competitors’ reaction patterns. When 
a businessman wonders about his competitors’ likely response to some 
move which he is considering, he must recognize that his competitor, too, 
is likely to take this interdependence phenomenon into account. The firms’ 
attempts to outguess one another are then likely to lead to an interplay of 
anticipated strategies and counterstrategies which is tangled beyond hope 
of direct analysis, for in this way management is only led to advance along 
an infinite sequence of compounded hypotheses: “If I make move A, he may 
consider making countermove B, but he may realize that I might then 
respond by making move C, in which case . . . ," and so on ad infinitum. 

There are several ways out of this state of confusion, which is as unsatisfactory
to the economic analyst as it is to the businessman who is
saddled with its problems. Each of these approaches has something to be 
said for it. 


1. Ignoring interdependence. The firm may simply ignore the entire 
matter, on the assumption that its competitor will also do so. There is 
some reason to believe that this is, in fact, what many firms do in their 
more routine day-to-day decision-making. They ignore the interdependence 
of the returns to the various firms in the industry because, as a practical 
matter, these complex effects of minor policy changes are not worth the 
effort required to take them into account. Since interdependence disap- 
pears from decision-making with such an approach, the analysis of this 
case is simply the standard analysis of the theory of the firm which was 
discussed in the preceding chapter. However, in the analysis of a really 
major decision—when an automobile manufacturer considers introducing a 
radical new design or a cigarette manufacturer is about to embark on a 
major advertising campaign—the decision-maker knows he cannot afford 
to dodge the issue in this way, and the complex problems of interdependence 
must reenter the discussion. 

2. Predicting competitors’ countermoves. A second way of dealing with 
the problem is for a firm to attempt to anticipate the nature of competitive 
reactions on the basis of guesswork or past experience. For example, the 
decision-maker may know that his competitors have usually matched his 
price changes within a few days or he may simply guess that they are likely 
to do so. In such a case, it is possible to take this definite reaction pattern 
into account and decide on a strategy which is optimal in terms of this 
assumption. The remainder of this chapter describes models which employ 
this second approach to the analysis of the interdependence problem. One 
difficulty which arises here, as we shall see, is that if two competitors both
proceed on this sort of optimality calculation, each is likely to find that his
prediction about the other was incorrect, because the optimality calculation
will lead him to act in a manner which was not predicted.

3. Preparing against optimal moves by competitors. A third approach to 
the analysis of the interdependence problem in business decision-making 
is that of the theory of games. Here the businessman does not guess at his 
opponent’s reaction pattern. Rather, he, in effect, calculates the optimal 
moves of the opposition—his rival’s best possible strategies—and prepares
his own defenses and countermeasures accordingly. Discussion of this third 
alternative has been postponed until Chapter 18. 


11. Stability of Oligopoly Arrangements: Kinked Demand Curves⁸


Let us now examine several oligopoly models which have attracted 
considerable attention. First let us consider one which is not designed to 
deal with oligopolistic price and output determination. Rather, it seeks 
to explain why, once a price-quantity combination has been decided upon, 
it will not readily change. 

The source of the problem is the fact that oligopolistic arrangements 
are notoriously undependable. For example, the history of price agreements 
contains case after case where “chiselers” seem to have found it advanta- 
geous to undercut the price which was agreed upon in order to grab off a 
larger share of the market. Yet, despite this phenomenon, prices in many 
oligopolistic industries appear to have exhibited a remarkable degree of 
stability, particularly in their resistance to change in the downward direc- 
tion. The model which will now be described is one possible explanation of 
the “stickiness” of oligopoly prices. 

Consider the effect on quantity demanded of a reduction in the price of 
a commodity. This is, as usual, shown by the demand curve for the product. 
Suppose, first, that the reduction in the price which is charged by our 
firm is matched by other competing concerns. In that case the company 
may expect to increase its sales slightly, but since it is not likely to get 
any customers away from its rivals in these circumstances, no large addition 
to its sales is to be anticipated. Its demand curve (DD' in Figure 7) will be 
relatively inelastic. 

Now suppose, on the other hand, that our company is the only one to
reduce its price. In that case a much larger increase in its demand is to be
expected. Thus, where no one else follows its price moves, the firm is likely
to have a relatively elastic demand curve like dd’.


8 The analysis of this section is based on the work of Sweezy, Hall, and Hitch. See
Paul M. Sweezy, “Demand Under Conditions of Oligopoly,” Journal of Political Economy,
Vol. XLVII, August 1939, reprinted in American Economic Association, Readings
in Price Theory, George J. Stigler and Kenneth E. Boulding (eds.), Richard D. Irwin,
Inc., Homewood, Ill., 1952, and R. L. Hall and C. J. Hitch, “Price Theory and Business
Behavior,” Oxford Economic Papers, No. 2, May 1939.

Let point C represent the firm’s current price-quantity combination. 
It has been argued that the large oligopolistic firm is likely to anticipate the 
following competitive reaction pattern to a price change: 


1. Price reductions: If our company reduces its price, competitors will 
feel the drain on their customers quickly and so they will be forced to 
match this price cut. In other words, for downward price movements from 
point C, the relevant portion of the firm’s demand curve will be segment 
CD’ of the steeper demand curve DD’. 

2. Price increases: If the company raises its price, it may expect that its 
happy competitors will welcome the new customers which they gain from
the price-raising firm as a result, and they will have no motivation to match
the price rise. Hence, for price rises the relevant part of the demand curve
will be the elastic segment dC.


In sum, given this view of competitive reaction patterns the company’s 
demand curve will be the composite curve dCD’, characterized by a kink 
(a sharp corner) at the point C, which represents the current price-output 
combination. 

It is now easy to see that a company with such a competitive response 
pattern will be extremely reluctant to change its price. For a fall in its 
price will yield no large increase in sales, while a price increase will result 
in a substantial cut in business, and neither of these is a very attractive 
prospect. 

The reader should also be able to show with the aid of the geometric 
technique of Chapter 3, Section 5, that the marginal revenue curve in this 
case is broken line dUVW. If the marginal cost curve happens to pass 
anywhere through the gap VU in the marginal revenue curve, the profit-maximizing
firm will have no motivation to leave the current price, P.
Even if there is, for example, a sharp rise in costs, so long as the marginal 
cost curve does not rise above point U it will lead to no price change. 
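
The gap in the marginal revenue curve is easy to compute when both demand segments are straight lines. The sketch below is an illustration under assumptions not made in the text: the two segments through the current point C have constant slopes, marginal cost is constant, and the function names and numbers are hypothetical. On a linear segment with slope s through (quantity0, price0), marginal revenue at the kink is price0 - s * quantity0, so the two segments give two different values there.

```python
def kinked_price(q, price0, quantity0, flat_slope, steep_slope):
    """Price on the composite curve dCD': flat segment up to C, steep beyond it."""
    slope = flat_slope if q <= quantity0 else steep_slope
    return price0 - slope * (q - quantity0)


def marginal_revenue_gap(price0, quantity0, flat_slope, steep_slope):
    """Return (V, U), the lower and upper ends of the MR gap at the kink."""
    upper = price0 - flat_slope * quantity0    # MR from the elastic segment dC
    lower = price0 - steep_slope * quantity0   # MR from the steeper segment CD'
    return lower, upper


def best_quantity(marginal_cost, price0, quantity0, flat_slope, steep_slope, q_max=300):
    """Grid search over whole-number outputs for the most profitable one."""
    def profit(q):
        price = kinked_price(q, price0, quantity0, flat_slope, steep_slope)
        return (price - marginal_cost) * q
    return max(range(q_max + 1), key=profit)


kink = dict(price0=10.0, quantity0=100, flat_slope=0.02, steep_slope=0.08)
lower, upper = marginal_revenue_gap(**kink)
print(lower, upper)
for mc in (4.0, 7.0, 9.0):
    print(mc, best_quantity(mc, **kink))
```

With these assumed figures the gap runs from about 2 to about 8. A rise in marginal cost from 4 to 7 leaves the best output at the kink, whereas a marginal cost of 9, above the top of the gap, moves the firm to a smaller output and a higher price.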

This analysis has been questioned on empirical grounds.⁹ Certainly it
seems clear that in an inflationary period oligopoly firms do often follow 
one another’s price rises, contrary to what is assumed by this model. How- 
ever, the analysis does show how the oligopolistic firm’s view of competitive 
reaction patterns can affect the changeability of whatever price it happens 
to be charging. 


12. Reaction Curves and Oligopolistic Pricing 


Let us now see how the businessman may go about setting his price if 
he has some definite ideas about his competitors’ reactions to his decisions. 
For diagrammatic simplicity the discussion is confined to the two-firm
(duopoly) case. Figure 8a summarizes this anticipated reaction pattern.
Reaction curve RbR'b contains the relevant information about the price reaction
of one firm, B, to the pricing decision of another firm, A. For example,
point P on this curve indicates that if firm A sets price OP, for its product, 
and if firm B reacts in accord with the information given by its reaction 
curve, the price of B's product will become OP;. 

If B does stick to this reaction pattern, A's optimal price decision can 
be represented quite simply. The broken curves in Figure 8a represent the 
indifference curves of A's objective function, that is, they are his iso-profit 
curves if A is a profit-maximizer. Then the highest indifference curve which 
A can attain (the highest indifference curve compatible with B's reaction 
pattern) is \(II'\), which is tangent to B's reaction curve at point T. To get 
to this point, A must set his price at OA,, and, accordingly, this must be 
his optimal price. 

So far so good. But, unfortunately for the analysis, two can play at 
optimization. Figure 8b contains, in addition to B's reaction curve, \(R_aR_a'\), 
which indicates the manner in which B expects A to react to his prices. 
B, in turn, may now pick an optimum point, say V, on A's reaction curve, 
\(R_aR_a'\), and thus he will set his price at OB,,. But if both A and B choose 
these “optimal” prices, they will end up neither on point T nor on V. 
Rather, the resulting price combination will be represented by W, a point 
which lies on neither reaction curve. 

The result will be that both players will be surprised at their earnings: 
they may either be pleasantly surprised (on higher indifference curves than 
they expected) or they may be disappointed. More important, they will 
both realize that the reaction curves have become falsehoods, for neither 
player is now reacting in accord with the dictates of his reaction curve. 
Once they realize this, they will know also that their optimality calculations 
have gone up in smoke. What was optimal for A so long as B stuck to his 
reaction curve need no longer be optimal once B strikes off on his own. 
Both firms must begin their calculations afresh, and we cannot say where 
they are likely to go from here.¹⁰ 

Thus we have not fully avoided the problems of interdependence in 
oligopoly price determination even if we have somehow found a reasonable 
method of constructing each oligopolist’s reaction curve (and the number 
of models which have been proposed indicates that this matter is far from 
cut and dried). 

9 See George J. Stigler, “The Kinky Oligopoly Demand Curve and Rigid Prices,” 
Journal of Political Economy, Vol. LV, October 1947, reprinted in American Economic 
Association, ibid.; J. L. Simon, “A Further Test of the Kinky Oligopoly Demand Curve,” 
American Economic Review, Vol. 59, December 1969; and W. J. Primeaux, Jr., and 
M. R. Bomball, “A Reexamination of the Kinky Oligopoly Demand Curve,” Journal 
of Political Economy, Vol. 82, July/August 1974. These authors have collected data 
indicating that oligopoly prices are no stickier than those of monopoly firms and that 
price increases by one firm are as frequently followed as are price decreases, contrary to 
what the model indicates. 

10 An alternative oligopoly model investigates what will happen if both firms stay 
on their reaction curves. It is easy to show that if the curves have the correct relative 
slopes the price combination must tend toward C, the point of intersection of A’s and 
B's reaction curves, for if A sets price OA, B will move to point \(B_0\), the corresponding 
point on his reaction curve (price OB), but then A will raise his price accordingly (he 
will move to point \(A_1\) on his reaction curve), etc. Prices will then move along the path 
\(A_0B_0A_1B_1\cdots\) toward intersection equilibrium price combination C. This is a 
generalization of the granddaddy of all oligopoly models, that of Cournot. See A. A. Cournot, 
Researches into the Mathematical Principles of the Theory of Wealth (1838), English 
translation, The Macmillan Company, New York, 1897. It is also possible to construct an 
Edgeworth contract curve from Figure 8b and Section 9, this chapter. This is, again, the 
locus of points of tangency of the participants’ indifference curves, and it has properties 
similar to those of the contract curve of bilateral monopoly. 
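
The adjustment process described in footnote 10, in which both firms stay on their reaction curves, can be sketched in a few lines of Python. The two linear reaction curves below are purely hypothetical: their intercepts and slopes, and the function names, are assumptions chosen for illustration. Each firm is assumed to move to its own reaction curve in turn.

```python
# Hypothetical linear reaction curves (illustrative assumptions only).
def react_b(p_a):
    """B's price response to A's price (B's reaction curve)."""
    return 2.0 + 0.5 * p_a


def react_a(p_b):
    """A's price response to B's price (A's reaction curve)."""
    return 3.0 + 0.25 * p_b


def adjustment_path(p_a, rounds=50):
    """Follow the path A0, B0, A1, B1, ... with both firms on their curves."""
    path = []
    for _ in range(rounds):
        p_b = react_b(p_a)      # B moves to his reaction curve
        path.append((p_a, p_b))
        p_a = react_a(p_b)      # A then responds to B's new price
    return path


path = adjustment_path(p_a=0.0)
print(path[0], path[-1])
```

In this assumed case the curves intersect at a price of 4 for each firm, and the path approaches that point because the product of the two slopes (0.5 × 0.25) is less than one in absolute value. With linear curves whose slopes have a product greater than one, the same loop would move away from the intersection.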

This concludes our brief discussion of the theory of oligopoly and with 
it, our survey of the standard analysis of the various alternative market 
forms. A number of significant results have emerged from the analysis, 
but the need for a theory which is richer in empirical content seems quite 
apparent. Particularly, one is left with the feeling that the oligopoly analysis 
has involved a number of interesting observations and has provided us 
with a number of helpful analytical concepts, but some of its most critical 
questions remain unanswered. 


13. Monopoly, Duopoly, and Discrimination: Elementary Mathematical Analysis 


Example 1: The Calculus of Price Discrimination 


Given two isolated markets supplied by a single monopolist, let the two cor- 
responding demand functions be 


\[ P_1 = 12 - Q_1 \quad\text{and}\quad P_2 = 20 - 3Q_2. \]

Suppose the monopolist's total cost function is 

\[ C = 3 + 2(Q_1 + Q_2). \]


A. What will prices, sales, and marginal revenues be in the two markets under 
a regime of price discrimination, and what profit will the monopolist earn? 


Answer: The total profit in the two markets together will be 

\[
\begin{aligned}
\Pi &= P_1Q_1 + P_2Q_2 - C = 12Q_1 - Q_1^2 + 20Q_2 - 3Q_2^2 - 2(Q_1 + Q_2) - 3 \\
    &= 10Q_1 - Q_1^2 + 18Q_2 - 3Q_2^2 - 3.
\end{aligned}
\]

Taking partial derivatives and equating them to zero we have 

\[ \frac{\partial \Pi}{\partial Q_1} = 10 - 2Q_1 = 0, \qquad \frac{\partial \Pi}{\partial Q_2} = 18 - 6Q_2 = 0, \]

so that \(Q_1 = 5\) and \(Q_2 = 3\). Hence, substituting these values of \(Q_1\) and \(Q_2\) into the demand and profit functions, 

\[ \Pi = 49, \quad P_1 = 7, \quad P_2 = 11 \]

and 

\[ \text{marginal revenue in market 1 } (MR_1) = \frac{\partial P_1Q_1}{\partial Q_1} = 12 - 2Q_1 = 2 \]

while 

\[ MR_2 = 20 - 6Q_2 = 2. \]


B. Find the corresponding values if the monopolist cannot discriminate. 


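Answer: Profits must now be maximized subject to the constraint¹ \(P_1 = P_2\), 
i.e., 

\[ 12 - Q_1 = 20 - 3Q_2. \]

Using the same profit function as before, we have the Lagrangian expression 

\[ \Pi_\lambda = 10Q_1 - Q_1^2 + 18Q_2 - 3Q_2^2 - 3 + \lambda(8 - 3Q_2 + Q_1) \]

so that we require 

\[
\begin{aligned}
\frac{\partial \Pi_\lambda}{\partial Q_1} &= 10 - 2Q_1 + \lambda = 0 \\
\frac{\partial \Pi_\lambda}{\partial Q_2} &= 18 - 6Q_2 - 3\lambda = 0 \\
\frac{\partial \Pi_\lambda}{\partial \lambda} &= 8 + Q_1 - 3Q_2 = 0,
\end{aligned}
\]

whose solution is \(\lambda = -2\), \(Q_1 = 4\), \(Q_2 = 4\). Substituting into the relevant 
expressions we obtain 

\[ \Pi = 45, \quad P_1 = 8, \quad P_2 = 8, \quad MR_1 = 4, \quad MR_2 = -4. \]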


Note that in the discrimination case marginal revenues were equal in both markets 
while, where prices were equal in both markets, marginal revenue in one of the 
markets was actually negative. It should also be observed that one price rose and 
the other fell under discrimination but that the monopolist’s profits were higher 
in the discriminatory case. 
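
Both parts of Example 1 can be checked by brute force. The Python sketch below simply searches over whole-number outputs, an assumption of convenience that works here only because both optima happen to be integers. The function and variable names are illustrative.

```python
def price_1(q1):
    return 12 - q1          # demand in market 1


def price_2(q2):
    return 20 - 3 * q2      # demand in market 2


def profit(q1, q2):
    """Total profit: revenue in both markets minus C = 3 + 2(Q1 + Q2)."""
    return price_1(q1) * q1 + price_2(q2) * q2 - (3 + 2 * (q1 + q2))


def marginal_revenues(q1, q2):
    """MR1 = 12 - 2*Q1 and MR2 = 20 - 6*Q2."""
    return 12 - 2 * q1, 20 - 6 * q2


# Candidate outputs: whole numbers at which both prices are non-negative.
grid = [(q1, q2) for q1 in range(13) for q2 in range(7)]

# Part A: discrimination, no restriction on the two prices.
best_a = max(grid, key=lambda q: profit(*q))

# Part B: only output pairs that yield the same price in both markets.
same_price = [q for q in grid if price_1(q[0]) == price_2(q[1])]
best_b = max(same_price, key=lambda q: profit(*q))

for label, (q1, q2) in (("A", best_a), ("B", best_b)):
    print(label, (q1, q2), profit(q1, q2), price_1(q1), price_2(q2),
          marginal_revenues(q1, q2))
```

The search reproduces the figures of the example: outputs (5, 3) with a profit of 49 under discrimination, and outputs (4, 4) with a profit of 45 and marginal revenues of 4 and −4 when a single price must be charged.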


1 It is also possible in simple cases to proceed just by adding the demand functions. 
We do this by first solving them for \(Q_1\) and \(Q_2\) in terms of \(P_1\) and \(P_2\); thus: 
\(Q_1 = 12 - P_1\), \(Q_2 = 20/3 - P_2/3\). Since \(P_1 = P_2 = P\) we may then add these to 
obtain the total demand, 

\[ Q = Q_1 + Q_2 = \frac{56}{3} - \frac{4}{3}P \quad\text{or}\quad P = 14 - \frac{3}{4}Q. \]

The profit function can now be written 

\[ \Pi = PQ - C = 14Q - \frac{3}{4}Q^2 - 2Q - 3 = 12Q - \frac{3}{4}Q^2 - 3, \]

whose maximum is \(Q = 8\), \(P = 8\), as before. 


Example 2: Cournot Equilibrium vs. Joint Maximization 


Let there be two firms in an industry and let the profits of each be dependent 
on both its output and the output of its competitor, thus: 

\[
\begin{aligned}
\Pi_1 &= 24Q_1 - Q_1^2 - 2Q_2^2 - 8 \\
\Pi_2 &= 30Q_2 - 3Q_2^2 - 2Q_1 - 9.
\end{aligned}
\]

A. What will be the magnitudes of outputs and profits if each firm, following 
the Cournot assumption, chooses its output to maximize its own profit on the 
assumption that the other firm will not react to this output decision? 


Answer: In this case the firms will just set the partial derivatives of their own 
profits with respect to their own outputs equal to zero, thus: 

\[
\begin{aligned}
\frac{\partial \Pi_1}{\partial Q_1} &= 24 - 2Q_1 = 0 \quad\text{or}\quad Q_1 = 12 \\
\frac{\partial \Pi_2}{\partial Q_2} &= 30 - 6Q_2 = 0 \quad\text{or}\quad Q_2 = 5.
\end{aligned}
\]

Hence, by substitution, \(\Pi_1 = 86\), \(\Pi_2 = 42\), \(\Pi = \Pi_1 + \Pi_2 = 128\). 


B. What will the firms’ profits and outputs be if they set output levels by 
collusion so as to maximize their joint (total) profits? 


Answer: Our objective function is now 

\[ \Pi = \Pi_1 + \Pi_2 = 22Q_1 - Q_1^2 + 30Q_2 - 5Q_2^2 - 17. \]

Setting 

\[ \frac{\partial \Pi}{\partial Q_1} = \frac{\partial \Pi}{\partial Q_2} = 0 \]

and solving, the reader should be able to show that we have 

\[ Q_1 = 11, \quad Q_2 = 3, \quad \Pi_1 = 117, \quad \Pi_2 = 32, \quad \Pi = 149. \]

Note that while total profits have risen, the profits of the second firm have 
decreased under joint maximization, and a redistribution of profits may be required 
to secure its agreement to the arrangement. 


PROBLEMS 


1. Prove that if the demand and average cost functions of an industry are linear, 
the monopoly (maximum profit) output will be exactly half the competitive 
(zero-profit) output. Show this both algebraically and geometrically. 

2. Given the following demand functions for two separated markets and the total 
cost function of the monopoly supplier, what will be the prices, outputs, and 
marginal revenues in the two markets, and the company’s total profits, (i) under 
price discrimination; (ii) with prices equal in both markets? 

(a) \(P_1 = 17 - 2Q_1\), \(P_2 = 25 - 3Q_2\), \(C = 2 + Q_1 + Q_2\) 

(b) &h22—Q P: = 9— 6Q2 C = Qı + Q2. 

3. Prove that if the marginal cost of supplying two markets is equal, then, under 
price discrimination, the marginal revenues in both markets must be equal. 

4. Given the following pairs of profit functions for two firms in an industry, find 
the profits and outputs (i) corresponding to a Cournot equilibrium; (ii) cor- 
responding to maximization of joint profits 
(a) I = 8Q1 — Q? — 2Q2 Il. = 10Q. — Qi — 4Q1 
(b m = 12Q,— 2} — Q: I. = 6Q2— Q2— Qi. N 
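
The answers to Example 2 of Section 13 can be checked by a brute-force search, and the same routine can serve as a template for problems of the type of Problem 4. The Python sketch below reads the profit functions of Example 2 as Π1 = 24Q1 − Q1² − 2Q2² − 8 and Π2 = 30Q2 − 3Q2² − 2Q1 − 9, the reading that reproduces the printed answers. The whole-number search range and the function names are assumptions made for illustration.

```python
def profit_1(q1, q2):
    return 24 * q1 - q1 ** 2 - 2 * q2 ** 2 - 8


def profit_2(q1, q2):
    return 30 * q2 - 3 * q2 ** 2 - 2 * q1 - 9


OUTPUTS = range(0, 31)   # assumed search range of whole-number outputs


def cournot_equilibrium(q1=0, q2=0, rounds=20):
    """Each firm repeatedly picks its best output, taking the rival's as given."""
    for _ in range(rounds):
        new_q1 = max(OUTPUTS, key=lambda q: profit_1(q, q2))
        new_q2 = max(OUTPUTS, key=lambda q: profit_2(q1, q))
        q1, q2 = new_q1, new_q2
    return q1, q2


def joint_maximum():
    """Output pair that maximizes the sum of the two firms' profits."""
    pairs = ((a, b) for a in OUTPUTS for b in OUTPUTS)
    return max(pairs, key=lambda q: profit_1(*q) + profit_2(*q))


for name, (q1, q2) in (("Cournot", cournot_equilibrium()),
                       ("joint", joint_maximum())):
    p1, p2 = profit_1(q1, q2), profit_2(q1, q2)
    print(name, (q1, q2), p1, p2, p1 + p2)
```

Under this reading the search returns outputs (12, 5) with profits 86 and 42 for the Cournot case, and outputs (11, 3) with profits 117 and 32 under joint maximization, in agreement with the example.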
Neumann-Morgenstern 
Utility Theory 


17 


1. Utility, Risk, and Game Theory 


We have already seen (Chapter 9) that the neoclassical economists 
constructed a cardinal utility theory as part of their analysis of consumer 
demand. This cardinal utility measure was designed to convey information 
about the psychological state of the consumer (or the businessman and the 
worker)—the magnitude of his desires, and the psychic gains and losses 
incurred by the alternative actions which are available to him. 

We also saw that it is possible to dispense with this utility analysis in 
much of the theory of consumer behavior and in the analysis of economic 
decision-making in general. A large class of decision processes can be ex- 
plained simply with the aid of information about the individual's prefer- 
ences, with no attempt to assign magnitudes to them. 

Game theorists have had no dispute with the resulting ordinalist 
(indifference curve) analysis, so far as it goes. However, in their work they 
have found it exceedingly useful to go beyond an ordinal utility measure 
wherever questions of risk arise.¹ The outcomes of the alternative choices 
available to the decision-maker are sometimes known only in probabilistic 
form, e.g., if he does A, he has one chance in three of ending up with $1,000 
and two chances in three of ending up with only $40. In such a case it is 
essential in game theory to do more than just examine the decision-maker’s 
ranking of A as against his other similar alternatives, B, C, ..., etc. 

1 The role of this utility analysis in game theory will be indicated in the next two 
chapters in the discussion of the concept of “mixed strategies.” 

In the course of their work on game theory, von Neumann and Morgen- 
stern were therefore led to construct their much-discussed cardinal utility 
measure for the ranking of situations involving probabilities (risky situa- 
tions). This nomenclature turns out to have been highly unfortunate. 
The resemblance between the Neumann-Morgenstern construct and the 
neoclassical utility measure ends largely with the use of the term “cardinal” 
to designate both analyses. Much misunderstanding and unnecessary con- 
troversy can be traced to this usage. 

Let us now turn to a general discussion of the mathematician's use of 
the term “measurement” and its varieties to make clear what is meant by 
the term “cardinal utility” in the N-M (Neumann-Morgenstern) sense. 
Only then will it be possible to explain its relationship to the neoclassical 
analysis. 


2. Classes of Measures and Their Strength 


A measure, in its most general sense, is simply a device which is de- 
signed to convey information about the phenomena to which it refers. 
Normally the information is conveyed by means of numbers. This requires 
a linguistic convention which indicates the meaning of the number “105” 
when we say that the measure of some feature of an object is 105. For 
example, if this number represents length or temperature, its meaning has 
been well defined (once the system of measurement—say, Fahrenheit vs. 
centigrade—is specified). 

It is sometimes desirable that such a number convey a great deal of 
information whereas in other circumstances very little needs to be com- 
municated by a “measurement.” As a result, one encounters measures 
which vary considerably in power to convey information. Let us examine 
three such classes of index, starting with the least powerful and describing 
them in order of increasing information content. It will then be shown that 
an ordinal utility index and the N-M utility index belong, respectively, to 
the second and third classes. 

Class 1. Associative measures: The weakest type of index (one which 
conveys very little information) is one which serves only to associate items 
in two different collections. For example, persons in a newspaper group 
photograph are sometimes identified by placing a number next to each 
face and the same number beside the corresponding name in a list printed 
underneath. Here we set up the linguistic convention that any one man's 
name and face are given the same number. The numbers themselves do 
not matter and can be transformed in any way we like (we can exchange 
any number for another), provided no two faces bear the same number. 
Class 2. Orderings or rankings: A ranking is a measure which assigns 
an ordering to a set of items—it tells us which of two items is higher on the 
scale. To take the standard illustration, in a hardness scale, harder minerals 
are given higher numbers. 

Note that such a ranking measure also contains associative informa- 
tion—two rocks with the same ranking number must be equally hard. 
Thus a ranking index does carry more information than an associative 
index. The former does everything which is done by the latter and more 
besides. 

Suppose we assign a set of numbers in such a ranking. We may note 
that the same information can be conveyed just as well by any other set 
of numbers provided that we still satisfy our linguistic convention. Of 
two rocks the one with the higher number in the initial numbering (the 
harder rock) must also bear the higher number in any new index which is 
assigned.² We describe this by saying that the index which measures, or 
rather describes, a ranking is unique up to a monotone transformation. Thus, 
while we still have quite a bit of choice in the assignment of numbers, we 
have considerably less option than in the case of an associative measure. 
That is because a ranking conveys far more information; thus the meaning 
of the numbers must be far more rigidly specified by the linguistic 
convention. 

It is clear that an ordinal utility measure is a ranking, so that it falls 
into this second class of indices. 
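
The phrase “unique up to a monotone transformation” can be made concrete with a short Python sketch. The three minerals and their hardness numbers are the usual Mohs values, supplied here as an illustration rather than taken from the text, and the helper names are hypothetical.

```python
import math

# Illustrative hardness index (Mohs numbers; an assumption of this sketch).
hardness = {"talc": 1, "quartz": 7, "diamond": 10}


def ranking(index):
    """Items ordered from the lowest index number to the highest."""
    return sorted(index, key=index.get)


def transform(index, f):
    """Apply the function f to every number of the index."""
    return {item: f(number) for item, number in index.items()}


cubed = transform(hardness, lambda x: x ** 3)       # increasing
logged = transform(hardness, math.log)              # increasing
flipped = transform(hardness, lambda x: -x)         # decreasing

print(ranking(hardness) == ranking(cubed) == ranking(logged))
print(ranking(hardness) == ranking(flipped))
```

Any increasing transformation leaves the ranking unchanged, and with it all the information an ordinal index carries. A transformation that is not increasing, such as the sign change above, does not.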

Class 3. Cardinal measures: Finally, we come to “cardinal measurement,” 
which conveys still more information than did either of the other 
types of measure. It permits us, from what we know about two items in 
isolation, to predict something about them in combination. Consider the 
problem of finding two pieces of cloth in a shop which together are large 
enough to cover a table at home. We wish to predict which two pieces will 
do this without having to take them home and try them (and return them for 
exchange if they turn out to be too short). A measure of length can help 
us here. We know that a table 3 yards long can easily be covered by two 
cloths whose lengths are 14 yards and 23 yards. 

It will be observed that a length index can be used to rank and to 
associate as well as for this sort of prediction. For example, we know just 
from the numbers and their standard interpretation that a 7-foot board 
is longer than a 3-foot board (ranking) and that all 3-foot boards are of 
equal length (association). 

Because we want a cardinal measure to be capable of making the sort 
of prediction which has just been described, we have very little choice in 
the numbers which are to be used. We wish to convey the prediction that 
whenever two items, A and B, are combined, we obtain some item, C, as a 
result. For this purpose it is customary to represent the process of 
combining A and B by the addition of the index of A to the index of B. For 
example, if A is a 6-foot board (a board of length index 6) and B is an 
8-foot board, we employ the linguistic convention that the length which is 
obtained by laying these two pieces of lumber end to end (item C) is 
6 + 8 = 14 feet long.³ That is why some cardinal measures are called 
additive. 

2 Clearly other linguistic conventions might do: softer minerals might be given 
higher numbers. 

The convention that we make our prediction by adding leaves us very 
little option in choosing the numbers of any such measure. Indeed, we 
have no more than two numbers which are up to us (two degrees of 
freedom)⁴ and the rest are then beyond our control: they are automatically 
dictated to us by our linguistic convention. For example, because we add 
lengths, we know that the length of two pieces of cloth must sum up to the 
number which we assign to the length they cover together. A cloth as long 
as a 3-foot piece and a 1-foot piece must be assigned the number “4 feet,” 
and the three pieces together must be called “8 feet,” etc. 

The N-M utility index is cardinal in this very specific sense—it is 
intended to be used for making predictions. It is employed to predict 
which of two lottery tickets (or which of two other risky alternatives) a 
person will prefer. We are given this individual's ranking of the alternative 
prizes offered by the lottery tickets and the odds on each prize. From this 
we wish to be able to infer by numerical calculation, and without actually 
asking the person, which lottery ticket he will choose. 


3 Here again we might use other conventions. Any process which uniquely assigns 
a third number to any pair of numbers will do the trick. We might, for example, use 
logarithmic rulers with slide-rule scales and get used to multiplying lengths: 10 inches 
and 5 inches = 50 inches. But the point is that to convey information we must start off 
with some sort of language. We must first set up our linguistic conventions in any 
convenient way and then find the numbers which correctly convey the information in 
this language. 

4 Actually a length measure conveys even more information than this, and so we 
have only one choice in assigning numbers to lengths: we can only decide on a unit of 
length, that is, whether we will measure in centimeters or inches. The point is that in a 
length measure we have a well-defined zero. Mathematically, zero is defined as a number 
which, when added to another, leaves the latter unchanged. A nonexistent piece of cloth (a 
piece of zero length) has an analogous (the mathematicians call it isomorphic) 
property. Hence, in measuring length we have only one degree of freedom, the choice of 
units of measurement, and a length measure is said to be unique up to a proportionate 
transformation. When measuring utility or temperature (without an absolute zero) we 
have two degrees of freedom, the choice of unit and the zero (freezing of water, or some 
other point, as in Fahrenheit measure), and hence these measures are said to be unique 
only up to a linear transformation. 
In N-M utility measurement we want to assign to lottery-ticket prizes 
utility numbers which, when they are processed arithmetically in accord 
with the linguistic convention which is described in the following section, 
will assign a utility number to the lottery ticket itself. This lottery-ticket 
utility number should have the property that it ranks the lottery ticket cor- 
rectly in relation to any other possible ticket. That is, lottery ticket A 
should be assigned a higher utility number than is given to B by this calcu- 
lation if and only if the person prefers A to B. In sum, the N-M utility 
index is intended as a calculator of lottery-ticket preferences. (The term 
"lottery ticket" is of course meant to denote any alternative involving 
risk.) 

Here again, once we pick such utility numbers for any two alternatives, 
we will see that we are left no choice on the numbers to be assigned to 
other alternatives. This, then, is the sense in which the N-M utility meas- 
ure is cardinal—it is richer in that it conveys more information than does 
an ordinal utility ranking obtained, for example, by directly asking any 
individual to state all his preferences. In effect, the N-M index is an econ- 
omy device which requires an interviewer only to ask the subject to state 
some of his preferences—his ranking of lottery-ticket prizes. With the aid 
of the N-M utility index (if it is applicable to this person) the interviewer 
can then deduce by himself the person's ranking of all other alternatives 
from the answers he has already received. 


3. Construction of an N-M Index 


If an individual exhibits some degree of consistency in his preferences, 
it is possible and convenient to construct an index which describes these 
preferences numerically. This "utility index" does so by assigning a higher 
“utility” number to some item, a, than to another item, b, if the individual 
happens to prefer a to b. However, for their purposes von Neumann and 
Morgenstern required a somewhat stronger index than this—one which 
would enable them to make the sort of deduction which has just been 
described. For this purpose it was necessary to make a few more assump- 
tions about the consistency of the preferences of the individual in question. 
These assumptions are described in Section 5, below. First, however, let 
us examine the mechanics of their special utility index. 

Consider a lottery ticket which offers two prizes: The first prize is a 
Cadillac and the booby prize is a pair of roller skates. Suppose the odds 
are one in one thousand of winning, that is, the probability of winning is 
0.001 so that the probability of losing is 0.999. Suppose also that, somehow, 
we obtain some information about our individual’s attitudes toward the 
two prizes, and that, by a method to be described presently, we express 
this psychological information by means of the statement that he values 
the Cadillac at 2,000 utils and the skates at 1 util. Then the N-M utility 
convention requires us to evaluate the lottery ticket at 


0.001 × 2,000 + 0.999 × 1 = 2.999 utils. 


More generally, if a lottery offers two prizes, A with probability P and B 
with probability 1 − P,⁵ and if their respective utilities are U(A) and 
U(B), then the utility of the lottery ticket, L, is defined to be 


(1) U(L) = PU(A) + (1 − P)U(B). 
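
Convention (1) amounts to a one-line calculation. A minimal Python sketch follows; the function name and the range check on the probability are illustrative additions.

```python
def lottery_utility(p, u_a, u_b):
    """Equation (1): U(L) = P*U(A) + (1 - P)*U(B)."""
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return p * u_a + (1 - p) * u_b


# The Cadillac and roller-skates ticket of the text.
print(round(lottery_utility(0.001, 2000, 1), 3))
```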


This simple calculation is all there is to the N-M evaluation of the 
utility of a lottery ticket, once we know the person's evaluation of its prizes. 
The crucial question is, then, how do we find the utility of these prizes? 

In principle, this is accomplished by an extension of the preceding 
convention (1). For this purpose we design a special (artificial) lottery 
ticket which will serve as a standard of comparison. Consider two extreme 
prizes, E and D. These are chosen so that E is, in his opinion, as good as 
anything our individual is likely to end up with (mnemonic device: E = 
eternal bliss) and D is as unpleasant as anything he may plausibly expect 
(D = damnation).⁶ Our standard lottery ticket, which we designate as 
S(P), offers our individual E with probability P and D with probability 
1 − P, where the probability number P is not specified (it is left free 
to vary). Let us assign any two arbitrary utility numbers to E and D, 
say U(E) = 100 and U(D) = 1.⁷ 

Now consider any ordinary prize, A, and let us see how a utility number 
is assigned to A. For some values of P in the standard lottery ticket, 
S(P), the individual will prefer S(P) to A, and for other values of P the 
reverse will be true. For example, if P = 1 (certainty of eternal bliss), he 
will surely prefer S(P) to A, and if P = 0 (certainty of damnation), he 
will prefer A to S(P). It is therefore plausible that there will be some 
in-between value, Pₐ, at which our individual is indifferent between A and 
S(Pₐ). Once we have found this in-between probability number, Pₐ (say 
Pₐ = 0.3), there is no difficulty in finding the utility of A. For A must 
have the same utility number as S(Pₐ) since he is indifferent between them. 
But the utility of this standard lottery ticket, U[S(Pₐ)], is easily calculated 
with the aid of our N-M linguistic convention, equation (1). We have 


U[S(Pₐ)] = PₐU(E) + (1 − Pₐ)U(D) 
         = 0.3 × 100 + 0.7 × 1 
         = 30.7 utils, 


i.e., 
U(A) = 30.7 utils. 

5 If one or the other of A and B is certain to occur, their probabilities must, by 
definition, sum up to unity. Thus, if the probability of A is P, that of B must be 1 − P. 

6 Actually it is not necessary to employ such extreme prizes: any two arbitrarily 
chosen prizes will do the trick. However, the E and D concepts make the logic of the 
construction easier to follow. 

7 This is where we use up our two degrees of freedom. There is, of course, one 
restriction on our choice of these numbers. By convention, we must have U(E) > U(D) since 
E is preferred to D. 


To summarize, in order to find a utility number which represents some 
individual’s attitude toward any prize, X, we interview or observe the 
person to find out the probability Pₓ at which he is indifferent between 
the standard lottery ticket, S(Pₓ), and X. We then evaluate the utility 
of X by using the standard N-M rule, Equation (1), to determine the 
utility of S(Pₓ). That is all there is to it. 
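
The construction can be sketched in Python. The text does not say how the interviewer locates the indifference probability; the bisection search below is one possible scheme, and the respondent is a hypothetical stand-in who prefers the standard ticket whenever P exceeds 0.3. The function names and the default values U(E) = 100 and U(D) = 1 follow the illustration above but are otherwise assumptions.

```python
def find_indifference_probability(prefers_standard, tolerance=1e-6):
    """Bisect on P until the respondent is nearly indifferent between S(P) and X.

    prefers_standard(p) must return True when the respondent prefers the
    standard ticket S(p) to the prize X. It is assumed that he prefers X
    at P = 0 and S(P) at P = 1, with a single switch point in between.
    """
    low, high = 0.0, 1.0
    while high - low > tolerance:
        mid = (low + high) / 2
        if prefers_standard(mid):
            high = mid      # S(mid) is already too attractive
        else:
            low = mid       # S(mid) is not attractive enough
    return (low + high) / 2


def utility_of_prize(p_x, u_e=100.0, u_d=1.0):
    """Equation (1) applied to the standard ticket S(P_x)."""
    return p_x * u_e + (1 - p_x) * u_d


def respondent(p):
    """Hypothetical person whose indifference point is P = 0.3."""
    return p > 0.3


p_a = find_indifference_probability(respondent)
u_a = utility_of_prize(p_a)
print(round(p_a, 4), round(u_a, 2))
```

For this assumed respondent the search settles near P = 0.3, and the prize is accordingly assigned about 30.7 utils, as in the text.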

But where does the N-M prediction come in? Suppose we have two 
lottery tickets L₁ and L₂ and we wish to predict which of these our 
individual prefers. If ticket L₁ offers alternative prizes A and B, and L₂ 
carries with it prizes C and D, we find the utilities of each of these prizes 
in turn by the procedure which has just been described. From these figures, 
in turn, we can evaluate, by the N-M calculation [Equation (1)], the 
respective utilities, U(L₁) and U(L₂), of L₁ and L₂. We then have the 
prediction that the person will prefer the lottery ticket with the larger 
calculated utility number. 
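
A minimal Python sketch of this prediction follows. The prize utilities and the odds are hypothetical numbers chosen for illustration, and the function names are assumptions. The sketch also rescales the utility numbers by an increasing linear transformation to show that the predicted choice does not depend on the two arbitrary numbers picked at the outset.

```python
def expected_utility(lottery, utility):
    """Sum of probability times utility over (probability, prize) pairs."""
    return sum(p * utility[prize] for p, prize in lottery)


def preferred(l1, l2, utility):
    """Predict the chosen ticket: the one with the larger calculated utility."""
    u1, u2 = expected_utility(l1, utility), expected_utility(l2, utility)
    if u1 > u2:
        return "L1"
    if u2 > u1:
        return "L2"
    return "indifferent"


# Hypothetical prize utilities obtained by interview (illustrative numbers).
utility = {"A": 30.7, "B": 60.0, "C": 45.0, "D": 50.0}
l1 = [(0.5, "A"), (0.5, "B")]   # ticket L1: prizes A and B
l2 = [(0.2, "C"), (0.8, "D")]   # ticket L2: prizes C and D

# An increasing linear transformation of every utility number.
rescaled = {prize: 3 * u + 10 for prize, u in utility.items()}

print(preferred(l1, l2, utility), preferred(l1, l2, rescaled))
```

Because the probabilities on each ticket sum to one, an increasing linear change in the utility numbers changes every ticket's calculated utility in the same linear way, so the predicted ranking is unaffected.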

Observe what has happened here. In order to assign utilities to the 
(riskless) prizes, we did have to interview or observe the person in question. 
But once he has committed himself on these we need ask him no further 
questions in order to predict his ranking of any lottery tickets in which 
only these prizes are involved. We do not have to ask him how he feels 
about the odds involved in these tickets—this can be determined for him 
from our computation. 


4. Expected Utility vs. Expected Payoff 


One feature of the N-M utility convention (1) should be pointed out. 
According to this rule a lottery ticket is evaluated at the actuarial (ex- 
pected) value of its utilities, not at the actuarial value of the prizes them- 
selves, as one might more usually be tempted to do. This assertion, which 
may not be clear to the reader, is most easily explained by example. Con- 
sider a lottery ticket whose prizes, A and B, are amounts of money. Let 
these amounts and their respective utilities be the figures shown in the 
following table: 


                           A        B 
Prize (dollars)            500      2 
Utility of prize (utils)   40       2 
Probability                P        1 − P 


A standard actuarial evaluation of this lottery ticket is 
500P + 2(1 − P), 
so that, e.g., if P = ½ (50-50 odds) this ticket’s actuarial value will be 
½(500) + ½(2) = 251 dollars. 


But in Neumann-Morgenstern utility analysis it is necessary to translate 
the prizes into utility terms before one can evaluate the ticket. In N-M 
analysis the ticket would then be valued at 40P + 2(1 − P) or 


½(40) + ½(2) = 21 utils. 
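
The two evaluations differ only in what is averaged. The short Python sketch below uses the prizes of $500 and $2 and the utilities of 40 and 2 utils implied by the two formulas above, at 50-50 odds; the function name is an illustrative assumption, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def expected_value(outcomes):
    """Probability-weighted average of (probability, amount) pairs."""
    return sum(p * amount for p, amount in outcomes)


p = Fraction(1, 2)                       # 50-50 odds
in_dollars = [(p, 500), (1 - p, 2)]      # prizes A and B in dollars
in_utils = [(p, 40), (1 - p, 2)]         # the same prizes in utils

print(expected_value(in_dollars), expected_value(in_utils))
```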


There is something inherently attractive in the latter procedure. The 
rational individual may be taken to be interested not in the money value 
of a prize, but in just how much winning it will mean to him (its utility). 
For example, a prize of $10,000 is ten times as large as a $1,000 prize, but 
if he needs the $1,000 very badly the utility of $10,000 may, in some sense, 
not be quite ten times as high. In evaluating the lottery ticket he should 
surely take this into account. 

In fact, diminishing (or increasing) marginal utility can easily affect 
the person’s attitude toward a lottery ticket. Suppose that $0 is evaluated 
at 0 utils by some individual, that $50 gives him 60 utils, while a second 
$50 yields him only 40 more utils. We have the following utility table which 
thus clearly involves diminishing marginal utility: 


Prize (dollars) 0 50 100 
Utility (utils) 0 60 100 


Now consider a lottery ticket which offers a 50-50 chance of zero or 
$100. Its actuarial value is, of course, (½)100 + (½)0 = $50. But to an 
expected utility-maximizer it is worth only 50 utils, which according to 
the table is far less than the value of $50. That is, this person will be willing 
to pay much less than $50 (its actuarial value) for the lottery ticket. 
Actually, this makes good common sense. If the ticket costs him $50, he 
stands a fifty-fifty chance of either winning or losing $50. But, if to him 
the marginal utility of money is diminishing, an added $50 is worth less 
(40 utils) than the disutility of a $50 loss (60 utils). Hence, he should 
never accept the ticket in exchange for $50. This is an old observation 
which has been made a number of times by economists and mathemati- 
cians. It illustrates how results which are more plausible intuitively can 
be obtained from expected utility rather than from actuarial (dollar) 
calculations. 
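
The effect of diminishing marginal utility on the value of the ticket can be worked out in a few lines of Python. The table gives utilities only for $0, $50, and $100, so the sketch assumes straight-line interpolation between those entries in order to put a dollar figure on the ticket. That interpolation and the function names are assumptions of the sketch, not part of the text.

```python
from fractions import Fraction

# Utility table of the text: (dollars, utils).
TABLE = [(0, 0), (50, 60), (100, 100)]


def utility(dollars):
    """Utils of a sure sum, interpolating linearly between table entries."""
    for (x0, u0), (x1, u1) in zip(TABLE, TABLE[1:]):
        if x0 <= dollars <= x1:
            return u0 + Fraction(u1 - u0, x1 - x0) * (dollars - x0)
    raise ValueError("sum lies outside the table")


def sure_sum_worth(utils):
    """Inverse of utility(): the sure sum that yields the given utils."""
    for (x0, u0), (x1, u1) in zip(TABLE, TABLE[1:]):
        if u0 <= utils <= u1:
            return x0 + Fraction(x1 - x0, u1 - u0) * (utils - u0)
    raise ValueError("utility lies outside the table")


half = Fraction(1, 2)
ticket_utils = half * utility(100) + half * utility(0)   # expected utility
ticket_dollars = half * 100 + half * 0                   # actuarial value

print(ticket_dollars, ticket_utils, utility(50))
print(float(sure_sum_worth(ticket_utils)))
print(utility(100) - utility(50), utility(50) - utility(0))
```

The ticket is worth 50 utils against 60 utils for a sure $50. Under the interpolation assumption it is worth a sure sum of roughly $41.67, and the last line shows the 40 utils gained from a second $50 against the 60 utils lost with the first.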

Another example from a totally different field may help to bring out 
the difference between utility and nonutility calculations. Suppose we are 
trying to evaluate two alternative bombing strategies. One of them offers a 
1 in 10 chance of getting through and destroying 80 per cent of the enemy’s 
productive capacity. The other, more conservative strategy offers eight 
out of ten chances of destroying 10 per cent of this productive capacity. 
On a straight actuarial evaluation the two strategies are equivalent (0.1 × 
80 = 0.8 × 10). But in terms of a doubtless more relevant utility analysis, 
this is not necessarily so. For example, experience suggests that a 10 per 
cent loss in productive capacity is easily made up and will result in no 
real long-run difference in enemy military strength, so that it may be almost 
worthless. On the other hand, an 80 per cent loss is likely to weaken him 
very substantially and may be of crucial military value, and a utility 
analysis might therefore definitely recommend this latter strategy over the 
other. We see, then, that in decision-making it seems more appropriate 
to use a calculation based on the utilities of the alternative outcomes 
rather than on the magnitudes of the outcomes themselves. It seems much 
more appropriate to maximize expected utilities than expected prize values.? 
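A minimal sketch of the bombing comparison follows. The probabilities and damage percentages are those given above; the utility numbers are purely hypothetical values chosen here to reflect the remark that a 10 per cent loss is almost worthless while an 80 per cent loss is crucial. The names `expected` and `assumed_utility` are likewise assumptions made for illustration.

```python
from fractions import Fraction

def expected(lottery, value=lambda outcome: outcome):
    """Probability-weighted sum of value(outcome) over (probability, outcome) pairs."""
    return sum(p * value(outcome) for p, outcome in lottery)

# Outcomes are the per cent of productive capacity destroyed (0 = raid fails).
bold = [(Fraction(1, 10), 80), (Fraction(9, 10), 0)]
cautious = [(Fraction(8, 10), 10), (Fraction(2, 10), 0)]

# ASSUMPTION: illustrative utility numbers, not taken from the text.
assumed_utility = {0: 0, 10: 1, 80: 100}

actuarial_bold = expected(bold)
actuarial_cautious = expected(cautious)
utility_bold = expected(bold, assumed_utility.get)
utility_cautious = expected(cautious, assumed_utility.get)

print(actuarial_bold, actuarial_cautious)  # 8 8
print(utility_bold, utility_cautious)      # 10 4/5
```

With these assumed utilities the two strategies tie on the actuarial count but the bolder one is far ahead in expected utility; a different utility assignment could of course reverse the ranking.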


8 Some critics, notably Professor Allais, have argued that although these calculations 
should be based on utility, they should include all the facts about the utility calculation 
and not just the expected value. For example, fifty-fifty odds of 100 and 200 (utils) have 
the same expected utility (150) as do fifty-fifty odds of 125 and 175. However, it may
be argued that since the former pair of utility payoffs is more widely spread out (dis-
persion = 200 − 100 = 100) than is the latter (175 − 125 = 50), the former lottery
ticket subjects the player to greater risk, and that, therefore, he need not be indifferent
between the two. Expected value does not tell the whole story!

However, it has been answered that the utility calculation already takes the disper- 
sion of the prizes into account. That is, if one lottery ticket offers a greater risk than 
another, its utilities are calculated in such a way as to discount for this fact. The utility 
of the riskier lottery ticket has already been reduced to take the risk into account. Hence 
if we make a second adjustment for the risk involved in dispersion of utilities, we would 
be double-counting. Those who take this position say that this is why the psychological 
assumptions described in the next section can be shown (see the appendix) to require the 
person always to pick the lottery ticket with the highest expected utility, no matter 
what the dispersion of utility payoffs. Since Allais disputes the appropriateness of these 
assumptions, of course this argument carries no weight with him. 
5. Psychological Premises Behind the Prediction


Let us return now to our main theme, the prediction which is obtained 
from a N-M utility calculation. We have seen how from a purely numerical 
manipulation we are able to make a forecast of human behavior—to 
predict which of two lottery tickets our individual will prefer. 

Clearly such a prediction need not always turn out to be correct. Its 
validity must rest on some sort of psychological assumptions, for only 
people of some particular psychological constitution will behave in accord 
with the N-M predictions. Those who have worked with the index are 
fully aware of this, and various sets of psychological premises have been 
formulated for the analysis. Let us now examine briefly a set of five
assumptions which suffice to produce an N-M psychology. The appendix to this
chapter contains a proof which shows that someone for whom these five 
premises are valid must always do as the N-M calculation predicts. 


Assumption 1: Transitivity: If our individual is indifferent between 
two prizes A and B, and he also happens to be indifferent between B and 
C, then he will be indifferent between A and C. As we have seen in Chapter 
9, this assumption also plays a role in indifference map analysis, so that it 
is no more restrictive than the usual ordinal utility analysis. 


Assumption 2: Continuity of preferences: This is, in effect, the plausible 
assumption (which has already been employed) that if our standard lot- 
tery ticket, S(P), is preferred to some prize, A, when P = 1, and if on 
the other hand, A is preferred to S(P) when P = 0, there exists some
in-between value of P at which S(P) and A are indifferent. 


Assumption 3: Independence: If our player is indifferent between a
Ford and a Chevrolet, he will be indifferent between two lottery tickets 
which are identical in all respects except that one of them offers a Ford 
as a prize while the other offers a Chevrolet instead. This assumption is 
also taken to hold for lottery tickets, e.g., if the person is indifferent be- 
tween the Ford and a lottery ticket, R, which offers him a chance at a 
Rolls Royce, he must also be indifferent between two lottery tickets one 
of which offers a Ford and the other of which offers as a prize (the lottery 
ticket) R. 


Assumption 4: Desire for high probability of success: Given two lottery 
tickets with identical prizes, our individual will prefer the lottery ticket 
with the higher probability of winning. This assumption is so persuasive 
that it hardly seems worth stating but it has been pointed out that there 
are exceptions even to this premise. Players of Russian Roulette and, 
sometimes, mountain-climbers seem to prefer to live dangerously! Finally
we have


Assumption 5: Compound probabilities: If the person is offered a lottery
ticket whose prizes are, in turn, other lottery tickets, his attitude toward 
this compound lottery ticket will be the same as though he had gone 
through all the probability calculations to find out what ultimate odds of 
winning and losing this compound ticket really offers him. (Luce and 
Raiffa cite Kuhn’s example of very real compound lottery tickets, which is 
worth recalling here. All over Paris one sees wheels of chance whose prizes 
are, in turn, tickets in the French National Lottery.) 


Some controversy has arisen out of the third and fifth premises. It is, 
of course, clear that few people will, or even can, go through the elaborate 
calculations envisaged in the last assumption, and no one has ever claimed 
otherwise. But the question which has been raised is whether they even 
ought to do so as a matter of self-interest. To illustrate the sort of objection 
which has been raised, let us consider one which has been advanced against 
the third (so-called independence) assumption. Many, if not most, people
will offer considerably less than $500 for a lottery ticket, T, which gives 
them a fifty-fifty chance of $1,000 or zero. Let us say that to one person 
T is worth $200 (he is indifferent between T and $200). Why is it worth 
so little to him? The answer, this argument asserts, is that he wishes to 
avoid risk. Two hundred dollars is a sure thing whereas T is not, so that 
he is willing to forego the difference between the $500 actuarial value of T 
and the $200 in hard cash to avoid the gambling element involved in 
taking T. 

But suppose we consider two lottery tickets L₁ and L₂ which differ
only in that T is a prize in L₁ and $200 is the corresponding prize of L₂.
Should he necessarily be indifferent between L₁ and L₂ as Assumption 3
requires? The answer, says this argument, is no, because the $200 is no
longer a sure thing; it has become a prize in a lottery ticket. In this case
since the person is gambling in any event, he may well prefer L₁, which
offers T (actuarial value $500) as one of its prizes, to L₂, in which the
corresponding prize is only $200. For the second choice (the $200 option) no
longer keeps him safe from risk. In other words, this argument maintains 
that the relative value of T and $200 is not independent of the context in 
which they are offered. They will be indifferent in one case but not in the 
other. 

It is not intended here to offer any judgment on the acceptability of 
the N-M psychological premises. Many economists consider them to be 
rather attractive assumptions but, as we have just seen, they have not 
gone unchallenged. The main thing to be recognized is that the validity 
of the N-M predictions must rest on this or some other set of psychological
assumptions. 


6. N-M vs. Neoclassical Cardinal Utility 


There remains one more subject to be explored—what relationship, if 
any, does the N-M cardinal utility theory have to that of the neoclassical 
utility theorists? It is generally (though not universally) agreed that there 
is none—the two utility measures have nothing in common insofar as their 
cardinality is concerned. 

It is not the purpose of the Neumann-Morgenstern utility index to set 
up any sort of measure of introspective pleasure intensity. Such a measure 
of “strength of feelings” is totally unnecessary in the theory of games for 
which the N-M utility theory was constructed. Rather, the utility mea- 
sure was set up for purposes of calculation, or rather of prediction (in the 
subtler sense of the word), to permit the theorist to determine in the absence 
of the player which of several risky propositions the player will prefer, for 
the solution of a duopoly game must predict which strategy each player 
will choose (prefer), and the theory must therefore be able to predict 
how each player will rank risky strategic decisions. 

Then where does the “cardinal utility measurement” enter this matter? 
The answer is that the word “cardinal” has been used, misleadingly, to 
mean two entirely different things. One denotation is the neoclassical, 
introspective, absolute marginal pleasure measurement. The other, game- 
theoretic, use of the word “cardinality” is entirely operational. The pre- 
diction as to which of the two lottery tickets will be chosen is most con- 
veniently made with the aid of a numerical calculation. We are given the 
person’s ranking of the prizes and intend to predict from these data which 
ticket he will choose. For this purpose N and M have constructed an index 
far more powerful than the ordinalists’. 

But note that this is not cardinal utility in the old-fashioned sense. 
Not a word has been said about successive increments of some item yielding 
diminishing (or increasing) marginal joy. Indeed, to a strict neoclassicist
the N-M index is a sheep in wolf's clothing—to him (but not to the mathe- 
matician) it is nothing but an ordinal measure, for while it can be used to 
predict, it can predict only rankings of lottery tickets! 

It is true that once we have derived a numerical N-M utility index we
can use it to compute numerical marginal utilities⁹ and some of the other
measures encountered in neoclassical utility theory, measures which
disappear in the ordinalist’s analysis. But this kinship between the two
cardinal theories is also illusory. It will be recalled that the ordinalist is
perfectly happy to use a concept of marginal utility of X measured in terms
of money (he calls it the marginal rate of substitution between X and
money), or, for that matter, marginal utility of X measured in terms of
any other commodity. He objects only to an introspective evaluation of
marginal utility in absolute psychological units.


9 For example, if U = f(M) is the N-M measure of the utility of money, M, to our
individual, we can compute the marginal utility of money to him as ΔU/ΔM or dU/dM.

But the marginal utility which can be derived from a N-M measure- 
ment is just this sort of marginal rate of substitution. To evaluate the 
marginal utility of apples in money terms we would ask, “How much more 
money are you willing to pay for an additional apple?" In N-M theory we 
ask, instead, “How much of an increase in the probability, P, of winning 
E in our standard lottery ticket, S(P), is worth the same as an additional 
apple?" The N-M marginal utility of X therefore ends up as no more 
than the marginal rate of substitution between X and the probability of 
winning the prespecified prize (E) of the standard lottery ticket. This is 
surely not cardinal measurement in the neoclassical sense. 


APPENDIX: THE PSYCHOLOGICAL PREMISES AND THE INDEX 


Let us now prove that the calculation of Section 3 of this chapter will 
predict correctly the lottery-ticket preferences of any person who satisfies 
the psychological assumptions of Section 5. 

First, it is necessary to restate these assumptions in somewhat greater 
detail. For this purpose it is convenient to employ the following notation: 
For any alternatives A and B let AIB mean A is indifferent with B, and 
for any alternatives A and B and any (probability) number P where 
0 < P < 1, let [P: A, B] represent a lottery ticket which offers the prob-
ability P of obtaining prize A, and 1 − P of obtaining prize B. Using this
notation, our assumptions become: 


Assumption 1: Transitivity: If for this person AIB and BIC, then AIC.


Assumption 2: Continuity of preference as a function of P: For any
three outcomes, E, A and D, if E is preferred to A and A is preferred to
D, there exists a (probability) number Pa such that 0 < Pa < 1 and
AI[Pa: E, D].


Assumption 3: Independence: For any four prizes, A, B, C, and F, if
AIB and CIF, then [P: A, C] I [P: B, F] for any probability P.

This states that if two investments (lottery tickets) involve equal
probabilities of attaining outcomes which are different but which are
valued equally, then the two investments will be equally attractive.


Assumption 4: For any alternatives E and D, and any probability
numbers r and r’, if E is preferred to D, then [r: E, D] is preferred to
[r’: E, D] if and only if r > r’.


This states that, other things being equal, we will always prefer the
investment opportunity with the greater probability of a favorable
outcome.


Assumption 5: Compound probability arithmetic: For any alternatives
E and D and any probability numbers P, Pa, and Pb,


[P: [Pa: E, D], [Pb: E, D]] I [r: E, D],


where r is a probability number given by r = PPa + (1 − P)Pb.


This requires some explanation. The long bracketed expression to the
left of the I represents a compound lottery ticket which offers the prob-
ability P of winning. If the ticket-holder wins, however, rather than a
definite prize he obtains another lottery ticket [Pa: E, D]. If he loses, he
is given, instead, the inferior lottery ticket [Pb: E, D], with the same prizes
but poorer odds. What is the probability of eventually coming out of all
this with the grand prize, E? There is a probability P of winning the better
lottery ticket which offers E with probability Pa, so the probability of
getting E in this way is PPa. However, if he loses the first draw, a loss
which will occur with probability 1 − P, the ticket-holder still has the
probability Pb of getting E, so that there is a probability (1 − P)Pb of
his obtaining E in this way. The total probability of obtaining E is then
PPa + (1 − P)Pb, which we have called r.

We may now interpret this last assumption to say that the person's 
psychology is such that he will evaluate a compound lottery ticket in 
terms of the probabilities of winning the ultimate prizes. 
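The reduction of a compound ticket to its ultimate odds can be sketched as follows. The probability values are hypothetical numbers chosen only for illustration, and the function name `reduced_probability` is an assumption of this sketch; the formula itself is the r of Assumption 5.

```python
from fractions import Fraction

def reduced_probability(p, p_a, p_b):
    """Overall chance of winning E from [P: [Pa: E, D], [Pb: E, D]]."""
    win_via_better_ticket = p * p_a          # win first draw, then win E
    win_via_inferior_ticket = (1 - p) * p_b  # lose first draw, then win E
    return win_via_better_ticket + win_via_inferior_ticket

# Hypothetical odds: P = 2/3, Pa = 3/4, Pb = 1/4.
P, Pa, Pb = Fraction(2, 3), Fraction(3, 4), Fraction(1, 4)
r = reduced_probability(P, Pa, Pb)

print(r)      # 7/12 -> ultimate chance of E
print(1 - r)  # 5/12 -> ultimate chance of D
```

A person who satisfies Assumption 5 treats the compound ticket exactly as he would the simple ticket [r: E, D].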

Let us now choose the lottery ticket to be used as a standard against 
which other alternatives can be evaluated. It is still convenient (but not 
necessary) to assume that this ticket offers to the winner eternal bliss 
(E) and to the loser damnation (D), so that any alternative, A, which we 
bring to be evaluated against this standard ticket will presumably be no 
better than E and no worse than D. 

By Assumption 2, for any such A, there will be a probability number
Pa (0 < Pa < 1) such that AI[Pa: E, D]. We can now prove the following:

Theorem 1: Possibility of predicting: Given any two lottery tickets
[P: A, B] and [P’: A’, B’] and a person whose preferences never violate
Assumptions 1-5, if we obtain (say by his introspection) the four prob-
ability numbers Pa, Pa’, Pb, and Pb’ chosen so that


(2) AI[Pa: E, D] and BI[Pb: E, D], etc.,


then from these numbers it is possible to predict which of the two lottery tickets
will be preferred.


Proof: We begin by evaluating our first lottery ticket in terms of E
and D. This we can do by replacing A and B by their equivalents in terms
of our standard lottery ticket, to obtain


[P: A, B] I [P: [Pa: E, D], [Pb: E, D]]    (by Assumption 3)
∴ [P: A, B] I [r: E, D]    (Assumptions 1 and 5),


where r is the probability number PPa + (1 − P)Pb. Similarly, the second
lottery ticket can be evaluated in terms of E and D as


[P’: A’, B’] I [r’: E, D]    where r’ = P’Pa’ + (1 − P’)Pb’.


Therefore, by Assumption 4, the individual must prefer [P: A, B] to
[P’: A’, B’] if and only if


(3) r = PPa + (1 − P)Pb > P’Pa’ + (1 − P’)Pb’ = r’,


and he will be indifferent between these tickets if and only if r = r’. But
by hypothesis, P, P’ are numbers given by the terms of the two lottery
tickets, and Pa, Pb, Pa’, and Pb’ were found out by observing or questioning
our individual. Then r and r’ can be evaluated directly and the higher of
these two numbers must, by Assumption 4, correspond to the preferred 
lottery ticket. (Q.E.D.) 
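The prediction procedure of Theorem 1 can be sketched in a few lines. Each ticket is described by three numbers: its own probability of winning, and the two standard-ticket probabilities elicited from the individual for its prizes. All numerical values below are hypothetical, and the function names are assumptions of this sketch rather than anything defined in the text.

```python
from fractions import Fraction

def reduced_probability(p, p_first_prize, p_second_prize):
    """r = P*Pa + (1 - P)*Pb: chance of E once prizes are replaced by standard tickets."""
    return p * p_first_prize + (1 - p) * p_second_prize

def predict_choice(ticket, ticket_prime):
    """Compare r and r' as in condition (3); each ticket is (P, Pa, Pb)."""
    r = reduced_probability(*ticket)
    r_prime = reduced_probability(*ticket_prime)
    if r > r_prime:
        return "first"
    if r < r_prime:
        return "second"
    return "indifferent"

# Hypothetical data: (P, Pa, Pb) and (P', Pa', Pb').
first = (Fraction(1, 2), Fraction(9, 10), Fraction(1, 10))   # r  = 1/2
second = (Fraction(1, 4), Fraction(9, 10), Fraction(2, 5))   # r' = 21/40

print(predict_choice(first, second))  # second
```

Only the person's answers about the standard ticket enter the calculation; no direct comparison of the two tickets themselves is needed.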


Let us now see how the N-M index is constructed and prove that it
can be used to predict correctly the choice of lottery ticket. As already
indicated, we employ the following linguistic convention (definition) for
evaluating the utility of a lottery ticket in terms of the utilities of its prizes:


(4) U[P: A, B] = PU(A) + (1 − P)U(B).


That is, if P = ¾ so that the odds of winning are 3 to 1, we evaluate the
utility of the lottery ticket at three-fourths the utility of victory plus one- 
fourth the utility of defeat. But we note again that this is only a convention. 
To show that it is usable we must first restate, in terms of our present 
notation, how these utility numbers can be found, and then we must prove 
that they must always assign a higher utility number to the preferred 
lottery ticket. 

To find the utility of any alternative, A, we first assign arbitrary
“utility” numbers


(5) U(E) > U(D)


to eternal bliss (E) and damnation (D) in our standard lottery ticket.
Now we find U(A) by recalling (2) and defining


(6) U(A) = U[Pa: E, D], U(B) = U[Pb: E, D], etc.,


so that by (4)


U(A) = PaU(E) + (1 − Pa)U(D), etc.


Hence by finding Pa in (2) the utility number U(A) can be computed.
Finally, let us prove


Theorem 2: Validity of the prediction: These utility numbers rank
lottery tickets correctly so that U[P: A, B] > U[P’: A’, B’] if and only
if the former is the preferred lottery ticket, i.e., if and only if (3) holds.¹⁰


Proof: The utility of the first lottery ticket is


U[P: A, B] = PU(A) + (1 − P)U(B)    [by convention (4)]
    = PU[Pa: E, D] + (1 − P)U[Pb: E, D]    [by (2) and (6)]
    = P{PaU(E) + (1 − Pa)U(D)}
        + (1 − P){PbU(E) + (1 − Pb)U(D)}    [by (4)],


which gives on multiplying out and rearranging terms


    = {PPa + (1 − P)Pb}U(E)
        + {P(1 − Pa) + (1 − P)(1 − Pb)}U(D)
    = {PPa + (1 − P)Pb}U(E)
        + {1 − PPa − (1 − P)Pb}U(D)
(7) = rU(E) + (1 − r)U(D),


where r is defined as in (3), above. Similarly, the utility of the second
lottery ticket is


(8) U[P’: A’, B’] = r’U(E) + (1 − r’)U(D).


Thus comparing (7) and (8) we see that since by (5) U(E) > U(D), the
first lottery ticket will have the higher utility number if and only if r > r’.
But we have just seen (3) that this condition also guarantees that that 
lottery ticket will be preferred. Thus we have proved that convention (4) 
will always assign a higher utility number to the preferred lottery ticket, 
as we require. 
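Theorem 2 can also be spot-checked numerically. The sketch below builds the utility numbers from definition (6) and convention (4), then confirms on a small grid of hypothetical probabilities, and for a few arbitrary choices of U(E) > U(D), that the utility ranking of two tickets always agrees with the ranking of r and r’. The function names, the grid, and the scale values are assumptions made for illustration; a finite check of this kind illustrates the theorem but does not replace the proof.

```python
from fractions import Fraction
from itertools import product

def utility_of_prize(p_x, u_e, u_d):
    """Definition (6) with convention (4): U(X) = Px*U(E) + (1 - Px)*U(D)."""
    return p_x * u_e + (1 - p_x) * u_d

def utility_of_ticket(p, p_a, p_b, u_e, u_d):
    """Convention (4): U[P: A, B] = P*U(A) + (1 - P)*U(B)."""
    return (p * utility_of_prize(p_a, u_e, u_d)
            + (1 - p) * utility_of_prize(p_b, u_e, u_d))

def reduced_probability(p, p_a, p_b):
    """r = P*Pa + (1 - P)*Pb, as in condition (3)."""
    return p * p_a + (1 - p) * p_b

grid = [Fraction(k, 4) for k in range(5)]            # 0, 1/4, ..., 1
tickets = list(product(grid, repeat=3))              # every (P, Pa, Pb) on the grid
scales = [(1, 0), (100, -50), (Fraction(7, 3), 2)]   # arbitrary U(E) > U(D)

comparisons = 0
for u_e, u_d in scales:
    for t1, t2 in product(tickets, repeat=2):
        higher_utility = (utility_of_ticket(*t1, u_e, u_d)
                          > utility_of_ticket(*t2, u_e, u_d))
        higher_r = reduced_probability(*t1) > reduced_probability(*t2)
        assert higher_utility == higher_r   # the two rankings never disagree
        comparisons += 1

print(comparisons)  # 46875 pairwise comparisons, all in agreement
```

The choice of U(E) and U(D) changes the utility numbers themselves but never the ranking, which is all the index is asked to deliver.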


10 The proof can easily be extended to the case of indifference.




REFERENCES 


Alchian, Armen A., “The Meaning of Utility Measurement,” American Economic
Review, Vol. XLIII, March 1953.

Ellsberg, Daniel, “Classical and Current Notions of Measurable Utility,” Economic
Journal, Vol. LXIV, September 1954.

Friedman, Milton, and Leonard J. Savage, “The Utility Analysis of Choices In-
volving Risk,” Journal of Political Economy, Vol. 56, August 1948, reprinted
in American Economic Association, Readings in Price Theory, J. G. Stigler
and K. E. Boulding (eds.), Richard D. Irwin, Inc., Homewood, Ill., 1952.

Luce, R. Duncan, and Howard Raiffa, Games and Decisions, John Wiley & Sons,
Inc., New York, 1957, Chapter 2.

Strotz, Robert, “Cardinal Utility,” American Economic Review, Vol. XLIII, May
1953.


More Difficult Readings

Allais, Maurice, Fondements d'une théorie positive des choix comportant un risque et
critique des postulats et axiomes de l'école américaine, Imprimerie Nationale,

Arrow, Kenneth J., Essays in the Theory of Risk Bearing, Markham Publishing
Company, Chicago, 1970, Chapter 2.

Borch, Karl, The Economics of Uncertainty, Princeton University Press, Princeton,
N.J., 1968.

Herstein, I. N., and John W. Milnor, “An Axiomatic Approach to Measurable
Utility,” Econometrica, Vol. 21, April 1953.

Marschak, Jacob, “Rational Behavior, Uncertain Prospects and Measurable
Utility,” Econometrica, Vol. 18, April 1950.

Samuelson, Paul A., “Probability, Utility, and the Independence Axiom,” Econo-
metrica, Vol. 20, October 1952.

Von Neumann, John, and Oskar Morgenstern, Theory of Games and Economic
Behavior, 2nd edition, Princeton University Press, Princeton, N.J., 1947,
Chapter 1 and Appendix.


Game Theory* 


18


1. Taking Account of Competitive Decisions 


One of the most vexing and persistent problems of the businessman is 
that of outguessing his rival. If only he could calculate in advance what 
the competition was going to do, his planning would become far easier 
and more effective. 

As we have seen in Chapter 16, this problem can be dealt with in a 
variety of ways. The simplest approach is applicable where experience 
with the behavior of a competitor makes it relatively easy to predict his 
strategies. Where such information is available, it is, in effect, possible to 
choose that decision which maximizes the firm's expected return after the 
effects of the rival's countermoves are taken into consideration. Procedures 
which resemble this are frequently encountered in business practice. 

But it is often against the competitor's interests to permit this sort of a
calculation. Management may therefore avoid too obvious a pattern in 
its decision-making in order to keep the opposition guessing. When it 
succeeds in this goal, no such simple prediction of competitive behavior 
will be possible. At best, one may be able to say something such as, “The 


* There is a considerable literature on this subject. The classical source is, of course, 
John von Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior, 
2nd edition, Princeton University Press, Princeton, N. J., 1947. For a superb exposition 
and further references, see R. Duncan Luce and Howard Raiffa, Games and Decisions, 
John Wiley & Sons, Inc., New York, 1957. 
tion is also valid, i.e., that any equilibrium pairs of strategies are necessarily
minimax-maximin strategies.²


Property 4. Equality of payoffs of different equilibrium pairs: A payoff 
table may possess more than one equilibrium pair of strategies. Suppose 
we call them (a, b) and (a’, b’). However, since by the preceding property, 
both a and a’ must be maximin strategies for A, they must yield the same 
security payoff to A. Similarly b and b’ must yield the same security value
to B. In other words, any of the four strategy combinations (a, b), (a’, b’),
(a, b’), and (a’, b) must yield the same payoffs. Therefore if there are
several equilibrium pairs of strategies and if one of the players picks any 
such strategy, then the other player can achieve the same degree of protec- 
tion (minimum payoff) no matter which of these he chooses. 


Property 5. Maximin strategies may be poor countermoves to nonminimax
strategies: The minimax strategy also has an important unattractive
feature. Suppose one of the firms is run by managers who are poorly
informed or are not very clever, or who simply are willing to take risks, 
and who, for any of those reasons, do not employ a maximin strategy. 
Then, the maximin strategy is likely to be unprofitable to the other firm. 
For example, if B employs strategy 1, then it will be strategy 3 which 
yields A its highest market share (64 per cent) and A’s maximin strategy, 
1, will not do as well. In other words, the prudent maximin strategy is 
only guaranteed to be good when playing against another prudent man! 


5. Geometry of Equilibrium Points: Saddle Points 


A’s payoff as a function of his and B’s strategy choices can be repre- 
sented in a three-dimensional diagram such as shown in Figure 1. In order 
to obtain a smooth diagram it has been necessary to assume that both A 
and B have entire (continuous) ranges of strategy choice open to them, 
involving decisions such as the prices to be charged for their projects or 
the amounts to be spent on advertising. Thus either player has an infinite 
number of choices open to him (one corresponding to each point on his 
strategy axis). This contrasts with situations involving only a finite
number of strategy possibilities (such as the three strategies assumed available
to A and four to B in the previous payoff matrix) to which most theorems
of game theory refer.


2 For if b is B’s most profitable move against a, this combination of strategies must
yield to A his smallest possible return from a, i.e., his security-level evaluation of strat-
egy a, call it S(a). Now, consider any other one of A’s strategies, and call it a’. The
security value of a’, that is, S(a’), is the minimum yield of a’ so that the combination of
strategies a’ and b must yield a payoff, c(a’, b), which is at least as large as S(a’). Hence
c(a’, b) ≥ S(a’). Moreover, since a is A's most profitable countermove to b, the com-
bination (a, b) must pay A at least as much as does combination (a’, b), i.e., we must
have S(a) ≥ c(a’, b). Comparing the two inequalities, we see that S(a) ≥ S(a’), i.e., no
other strategy, a’, has a security value greater than that of a. Hence a must be A's
maximin strategy. A similar argument shows that b must be B's minimax strategy.

In this diagram A’s market share is shown directly by the height of the 
surface above any point on the floor of the diagram (representing a com- 
bination of strategy choices by A and B). B's market share can be inferred 
just as easily, since whatever per cent of the market is not held by A must 
be in B's hands. B's share of market surface is thus what is left of the 
three-dimensional cube in the diagram after A's share of market surface 
has been removed. 

It will be noted that A's surface has been drawn to involve two very
special characteristics. It has the crest of a hill running roughly north to
south and the trough of a valley running roughly east to west. The valley
is the locus of security levels for A's strategies. For example, if he chooses
strategy a in the diagram, the worst payoff he can possibly receive is
M'M, the payoff at the point where line aa’ (the line which represents
strategy a) falls under the trough or valley line of the payoff surface.
Similarly, the crest line is the locus of B's minimal payoffs (i.e., the
maximal payoffs to A for any of B's possible strategy choices).


[Figure 1: A's payoff surface]

The altitude of the point of intersection, M, between the trough and 
the crest lines is A's payoff from the combination of A's maximin strategy 
a and B's minimax strategy b. A's maximin strategy a is his maximin 
strategy because that is where the trough line reaches its highest point, 
i.e., where it crosses the crest of the hill. For similar reasons b is B's mini- 
max strategy. Thus, M is an equilibrium point and M’ represents the 
equilibrium pair of strategies a, b.

Because the graph of the payoff surface has the shape of a somewhat 
distorted saddle, a point such as M, which is the intersection of a hill 
crest and a valley trough, is called a saddle point (see also Figure 1 of 
Chapter 6). 
6. Payoff Matrices Without Equilibrium Points


We have yet to deal with the case where A’s share of market payoff 
matrix has no saddle point. An illustration is provided by the following 
payoff matrix for A:


                        B's strategy
                          1      2
A's strategy      1      80     20
                  2      40    100

Here A has a maximin strategy, 2, because 40 is the maximum of the 
lowest numbers in the two rows. Similarly, B has a minimax strategy 1 
which will guarantee that A obtains a market share no larger than 80 per 
cent. This is not a saddle point because the maximin strategy combination 
(2, 1) and the minimax combination (1, 1) do not coincide. The lowest 
point on the crest of the surface is at a different location from the highest 
point in the valley. Note that if the maximin-minimax combination of 
strategies is employed, A will obtain 40 per cent of the market so that B 
will be pleasantly surprised. 
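The maximin and minimax calculations can be sketched as below. The four matrix entries are not quoted as a table here; they are inferred from the figures used in this section (row minima of 20 and 40, a column maximum of 80, and the expected values computed in Section 7), so they should be read as a reconstruction. The function names are hypothetical.

```python
# A's payoff matrix (market share, per cent). Rows are A's strategies 1 and 2,
# columns are B's strategies 1 and 2. ASSUMPTION: entries are inferred from
# the numbers quoted in the surrounding text.
PAYOFF = [[80, 20],
          [40, 100]]

def maximin(matrix):
    """A's safest pure strategy: the row whose worst payoff is largest."""
    row_minima = [min(row) for row in matrix]
    value = max(row_minima)
    return row_minima.index(value) + 1, value

def minimax(matrix):
    """B's safest pure strategy: the column whose best payoff to A is smallest."""
    column_maxima = [max(column) for column in zip(*matrix)]
    value = min(column_maxima)
    return column_maxima.index(value) + 1, value

a_strategy, a_security = maximin(PAYOFF)
b_strategy, b_ceiling = minimax(PAYOFF)
has_saddle_point = a_security == b_ceiling

print(a_strategy, a_security)  # 2 40
print(b_strategy, b_ceiling)   # 1 80
print(has_saddle_point)        # False
```

A saddle point in pure strategies exists exactly when the maximin value equals the minimax value; here 40 differs from 80, so there is none.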

Even in this case, the maximin-minimax procedure will still be the 
coward’s strategy. By definition, it provides maximum protection against 
one’s competitor. 

But now this strategy pair lacks the second attractive feature which it 
possessed in the saddle-point case. It is no longer an equilibrium pair. 
That is, if B is certain to employ his minimax strategy 1, A is better off
employing his nonmaximin strategy 1, which will raise his market share 
from 40 to 80 per cent. It is also easy to see that B will want to change his 
strategy if A changes from his maximin to his nonmaximin strategy, and 
that A’s best counterstrategy will depend, equally, on B’s strategy choice. 
Thus, in the absence of a saddle point, the choice of strategies becomes a 
highly unstable affair. 
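The instability can be made concrete by letting each player reply in turn with his best countermove. The sketch below assumes the same 2 × 2 payoff matrix inferred from the figures in this section (80, 20 in A's first row and 40, 100 in his second); the helper names are hypothetical.

```python
# ASSUMPTION: A's payoff matrix as inferred from the figures quoted in the text.
PAYOFF = [[80, 20],
          [40, 100]]

def best_reply_for_a(b_strategy):
    """A picks the row with the largest payoff in B's chosen column."""
    column = [row[b_strategy - 1] for row in PAYOFF]
    return column.index(max(column)) + 1

def best_reply_for_b(a_strategy):
    """B picks the column that leaves A the smallest payoff in A's chosen row."""
    row = PAYOFF[a_strategy - 1]
    return row.index(min(row)) + 1

# Start at the maximin-minimax pair: A plays 2, B plays 1.
a, b = 2, 1
history = [(a, b)]
for _ in range(2):
    a = best_reply_for_a(b)
    history.append((a, b))
    b = best_reply_for_b(a)
    history.append((a, b))

print(history)  # [(2, 1), (1, 1), (1, 2), (2, 2), (2, 1)]

# No pair of pure strategies is a mutual best reply.
stable_pairs = [(i, j) for i in (1, 2) for j in (1, 2)
                if best_reply_for_a(j) == i and best_reply_for_b(i) == j]
print(stable_pairs)  # []
```

The sequence of replies returns to its starting point without ever settling down, which is what the absence of an equilibrium pair means.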

This does not mean that maximin or minimax strategies are now neces- 
sarily undesirable. Especially when A and B are highly uncertain of one 
another’s plans, they may still both prefer to play it safe and to stick to 
this decision no matter what the risky temptations. 


7. Mixed Strategies 

There is another interesting type of strategy alternative open to A 
and B. Its analysis will require the aid of the utility theory developed in 
Chapter 17. It turns out that if the number of possible pure strategies 
with which we began is finite, this alternative type of strategy has the 
effect of replacing A’s share of market payoff matrix with another which
has the following remarkable property: If the original share of market 
surface had a saddle point, the new surface will also have one at the same 
location. And even if the original share of market surface had no saddle 
point, the new one always will! 

The way in which this additional set of strategic possibilities enters
can be indicated in a somewhat roundabout manner. It was remarked at 
the beginning of the chapter that good business policy will often seek to 
prevent the competition from predicting one’s own strategy. One way 
of doing this is to choose one’s strategy randomly, e.g., to pick two fairly 
good strategies and choose between them by the toss of a coin or some 
other chance device such as a spinner. The combination of the two strategies 
and the probabilities assigned to the two strategies³ is itself called a mixed
strategy, as compared with the pure strategy which involves no such random 
elements. 

Mixed strategies can easily be shown to have a very interesting prop- 
erty. They can often increase the security levels available to both com- 
petitors when the pure-strategy payoff matrix has no saddle point. This is 
easily demonstrated with the aid of the preceding payoff matrix. 

Suppose that the payoffs in the table, instead of representing share of 
market, are measured in utility terms. We can then compute the (expected)
N-M utility of a mixed strategy to compare its value with that of a pure 
strategy. 

Consider the mixed strategy which offers A a 1-to-3 chance of having
to employ strategies 1 and 2, respectively (probabilities ¼ and ¾). If B
employs his strategy 1, the expected value of the utility of the outcome to
A will be ¼(80) + ¾(40) = 50. On the other hand, if B employs his other
strategy, 2, the expected value of A's payoff is ¼(20) + ¾(100) = 80.
Either of these outcomes exceeds
the 40 guaranteed to A by his maximin pure strategy, and so, simply by 
turning his decision into a gamble, A has increased the level of protection 
which is available to him. In the same way, B can also increase the value 
of his minimum payoff by the use of mixed strategies. 
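
The arithmetic of this illustration can be checked with a short computation. The sketch below is not part of the original exposition. It assumes the two-by-two payoff table implied by the figures quoted in this section (payoffs to A of 80 and 20 from his strategy 1, and of 40 and 100 from his strategy 2), and the names PAYOFF, expected_payoffs, and security_level are purely illustrative.

```python
from fractions import Fraction

# A's payoffs in utility terms (assumed from the figures quoted in the text).
# Rows: A's pure strategies 1 and 2. Columns: B's pure strategies 1 and 2.
PAYOFF = [[80, 20],
          [40, 100]]


def expected_payoffs(q, payoff=PAYOFF):
    """A's expected payoff against each of B's pure strategies when A
    plays his pure strategies with the probabilities in q."""
    return [sum(q[i] * payoff[i][j] for i in range(len(q)))
            for j in range(len(payoff[0]))]


def security_level(q, payoff=PAYOFF):
    """The worst expected payoff A can be held to when he mixes with q."""
    return min(expected_payoffs(q, payoff))


# Best guarantee available from a pure strategy (the maximin).
pure_maximin = max(min(row) for row in PAYOFF)

# The 1-to-3 mixture discussed in the text.
mixed = [Fraction(1, 4), Fraction(3, 4)]

print("pure maximin:", pure_maximin)
print("expected payoffs of the mixture:", [int(x) for x in expected_payoffs(mixed)])
print("security level of the mixture:", int(security_level(mixed)))
```

Exact fractions are used so that the results agree with the hand computation (50 and 80) without rounding.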

There appears to be an element of sleight of hand in this procedure, 
and some writers have questioned the value of mixed strategies except 
as a means of confusing the competition. The businessman in question must 
always assume the worst in situations in which his competitor is not kept 
in the dark by the use of a random decision device, but he must accept 
a more temperate evaluation of his own prospective payoff when his own 
decision is made into a gamble. If the individual were to adopt as pessimistic 
a view of the outcome of a mixed strategy as he does for a pure strategy, 
he would act as though he were always sure of losing and a mixed strategy 
could not increase his degree of protection.⁴ 


3 The odds can be set any way the player prefers to have them. For example, if the 
pointer in a spinner can fall on any number from 1 to 12, the odds are set at 2 to 1 
in favor of strategy 1 by deciding to play strategy 1 if the pointer falls anywhere on 
1 through 8 and to play strategy 2 otherwise. Since there is an infinite number of pos- 
sible sets of odds, the number of mixed-strategy alternatives open to a player is infinite. 

Mixed strategies have proved advantageous to A by enabling him (only 
in his anticipations) to take his eggs out of one strategy basket. Actually, 
however, by using such a strategy he has opened himself to the worst of 
contingencies. If luck is against him, and B uses strategy 2 while against 
3-1 odds he is led to employ 1, his payoff will be only 20, the lowest possible 
payoff in the table, which is well below the 40-util payoff to his maximin 
pure strategy. The mixed strategy has really provided him with no absolute 
protection in the ultimate outcomes, unlike his pure maximin strategy. 
But, in return, the mixed strategy has permitted him the luxury of antic- 
ipating far more desirable outcomes, e.g., the 100 payoff which the maximin 
pure-strategy approach would have required him to keep out of his calcu- 
lations altogether. 
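
The contrast drawn here between expected and realized outcomes can be made explicit. The following sketch, which is an added illustration using the same assumed two-by-two table as above, compares the guarantee of the pure maximin strategy with the expected floor of the 1-to-3 mixture and with the outcomes that can actually materialize once the random device has spoken.

```python
from fractions import Fraction

# Assumed payoff table (rows: A's strategies, columns: B's strategies).
PAYOFF = [[80, 20],
          [40, 100]]
q = [Fraction(1, 4), Fraction(3, 4)]  # the 1-to-3 mixture from the text

# What A can count on with certainty from his best pure strategy.
pure_guarantee = max(min(row) for row in PAYOFF)

# The floor under A's *expected* payoff when he uses the mixture.
expected_floor = min(sum(q[i] * PAYOFF[i][j] for i in range(2))
                     for j in range(2))

# Payoffs that can actually occur: any cell in a row played with
# positive probability.
realizable = [PAYOFF[i][j] for i in range(2) if q[i] > 0 for j in range(2)]
worst_realized = min(realizable)
best_realized = max(realizable)

print(pure_guarantee, int(expected_floor), worst_realized, best_realized)
```

The expected floor (50) lies above the pure-strategy guarantee (40), yet the worst realized payoff (20) lies below it, which is the point made in the paragraph above.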


8. Optimal Mixed Strategies and the Saddle-Point Theorem 


The theorem that there must be a saddle point on the surface which 
represents the payoffs from a finite number of pure strategies and, in 
addition, all possible mixed strategies into which these can be combined, 
was first proved by J. von Neumann. It has been called the fundamental 
theorem of two-person, zero-sum game theory. 

To discuss the derivation of this theorem the concept of the optimal 
mixed strategy must first be described. The odds attached to the different 
pure strategies which make up the mixed strategy need not be chosen 
arbitrarily. By changing the odds in our previous illustrative computation, 
it is easy to show that different mixed-strategy odds yield different maxi- 
mum security levels for a player. He should therefore look for a set of 
probabilities for his own mixed strategy which will make his security level 
as large as possible.⁵ This so-called optimal mixed strategy will set the 
highest possible floor, L, under A’s expected earnings. If A plays this 
mixed strategy, the best B can hope to do is keep A’s expected earnings 
down to level L. 


4 Such a person, in fact, would not possess the second characteristic of the expected- 
utility maximizer as listed in Chapter 17. To him the value of a gamble will not vary 
with the magnitude of the probability of winning. A ten-to-one chance of making $50 
or $5 would be worth exactly the same to him as a similar bet at fifty-fifty odds; 
both would be worth exactly $5. This again suggests that the extreme pessimism of a 
maximin strategy may not always be appropriate. 

5 This is the point where the Neumann-Morgenstern utility theory is required. We 
wish to find which of the infinite number of possible mixed strategies is preferred by 
the maximining player. But the utility axioms tell us that he will prefer the one for which 
the expected security (utility) value is highest. We therefore compute the general ex- 
pression for this expected value in accord with the procedures of the utility theory, 
and are then in a position to use linear programming methods to find the odds which 
maximize this expected value, in the manner described below. 

If we go back to the first payoff matrix in this chapter for an illustra- 
tion, the probabilities which constitute this optimal mixed strategy can 
be found with the aid of the following linear program: Let $q_1$, $q_2$, and $q_3$ be 
the probabilities which player A will assign to the pure strategies one of 
which is to be chosen by his random device. Then he seeks those values 
of the q’s which maximize L, the floor to his payoff. Since the q’s are prob- 
abilities, their sum must equal unity and none of them may be negative. 
We also require (in order to be sure L is a floor) that no matter what his 
opponent does, A’s expected payoff is at least L. The symbol $q_1$ represents 
the probability that A will be told by his random device to employ his 
pure strategy 1, so that, if his opponent plays his strategy 1, A will have 
the probability $q_1$ of receiving the corresponding payoff, which we denote 
by $P_{11} = 50$ (our first payoff matrix). If his opponent plays strategy 1, he 
can also expect payoff $P_{12} = 27$ with probability $q_2$, etc. 

Considering all these possibilities together, we see that if B plays his 
strategy 1, A’s payoff must turn out to be one of the numbers 50, 27, or 
64 ($= P_{13}$) and A's expected payoff will then be 50 multiplied by the 
probability, $q_1$, that he will end up with strategy 1 plus 27 multiplied by 
the probability of strategy 2, $q_2$, plus $64q_3$. A will seek q’s which guarantee that 
this sum is no less than L. Similarly, if B were to play his strategy 2, A’s 
expected payoff would be $90q_1 + 5q_2 + 30q_3$, and this, too, must be no less 
than L. Similar conditions must also take care of A’s expected payoff if B 
should employ either strategy 3 or 4. Taking all of these constraints into 
account, we see that the determination of A’s optimal mixed strategy 
constitutes the following linear programming problem: 


Maximize L (the floor to A’s payoff) subject to 

$$
\begin{aligned}
50q_1 + 27q_2 + 64q_3 - L &\ge 0 \\
90q_1 + 5q_2 + 30q_3 - L &\ge 0 \\
18q_1 + 9q_2 + 12q_3 - L &\ge 0 \\
25q_1 + 95q_2 + 20q_3 - L &\ge 0
\end{aligned}
$$

(These conditions state that A's expected earnings are never less than L, i.e., that L is truly a floor.) 

$$
\begin{aligned}
q_1 + q_2 + q_3 &= 1 \\
q_1 \ge 0, \quad q_2 \ge 0, \quad q_3 &\ge 0
\end{aligned}
$$

(These conditions must be satisfied for the q’s to be probabilities.) 

Since whatever B gains A must lose, B will be equally anxious to mini- 
mize A’s expected payoff. His optimal mixed strategy will impose on 
A’s expected maximum payoff (the most A can get for himself) a ceiling, 
S, which is as low as possible. It is easy to check that the probabilities, 
$v_1$, $v_2$, $v_3$, and $v_4$, which accomplish this are found with the aid of the dual 
of the preceding program: 


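Minimize S subject to 

$$
\begin{aligned}
50v_1 + 90v_2 + 18v_3 + 25v_4 - S &\le 0 \\
27v_1 + 5v_2 + 9v_3 + 95v_4 - S &\le 0 \\
64v_1 + 30v_2 + 12v_3 + 20v_4 - S &\le 0 \\
v_1 + v_2 + v_3 + v_4 &= 1 \\
v_1 \ge 0, \quad v_2 \ge 0, \quad v_3 \ge 0, \quad v_4 &\ge 0.
\end{aligned}
$$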
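
The two programs can be explored numerically before any solution method is applied. The sketch below is an added illustration, not a linear programming routine: it takes the payoff figures from the constraint coefficients above and, for a few trial mixtures, computes the largest floor L that A's constraints permit and the smallest ceiling S that B's constraints permit. The function names floor_for and ceiling_for and the trial mixtures are assumptions made for the example.

```python
from fractions import Fraction

# A's payoffs, read off the constraint coefficients.
# Rows: A's pure strategies 1-3. Columns: B's pure strategies 1-4.
PAYOFF = [[50, 90, 18, 25],
          [27, 5, 9, 95],
          [64, 30, 12, 20]]


def floor_for(q):
    """Largest L satisfying A's four inequality constraints for mixture q."""
    return min(sum(q[i] * PAYOFF[i][j] for i in range(3)) for j in range(4))


def ceiling_for(v):
    """Smallest S satisfying B's three inequality constraints for mixture v."""
    return max(sum(v[j] * PAYOFF[i][j] for j in range(4)) for i in range(3))


third, quarter = Fraction(1, 3), Fraction(1, 4)

floors = {
    "A pure 1": floor_for([1, 0, 0]),
    "A pure 2": floor_for([0, 1, 0]),
    "A pure 3": floor_for([0, 0, 1]),
    "A equal odds": floor_for([third, third, third]),
}
ceilings = {
    "B pure 1": ceiling_for([1, 0, 0, 0]),
    "B pure 2": ceiling_for([0, 1, 0, 0]),
    "B pure 3": ceiling_for([0, 0, 1, 0]),
    "B pure 4": ceiling_for([0, 0, 0, 1]),
    "B equal odds": ceiling_for([quarter] * 4),
}

for name, value in {**floors, **ceilings}.items():
    print(name, value)
```

Among these trial mixtures the best floor (18, from A's pure strategy 1) coincides with the best ceiling (18, from B's pure strategy 3). Because the left-hand side of A's third constraint can never exceed 18, no other mixture can give A a higher floor in this particular matrix.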


We can now apply the two duality theorems of Chapter 6, Section 3, 
to this zero-sum, two-person game.⁶ Theorem I tells us that if both players 
employ their optimal mixed strategies, then L, the highest floor which A 
can set under his payoff, will be equal to S, the lowest ceiling which B can 
place over A’s earnings. This means that there will always exist a pair of 
mixed strategies which constitute an equilibrium pair in the sense that 
neither player can do any better for himself when the other employs his 
optimal mixed strategy. A can guarantee himself no more than L = S, 
and B cannot force him to take less. This is the fundamental theorem of 
the zero-sum, two-person game. This theorem paved the way for most 
further game-theory analysis. It may be added that von Neumann's original 
proof is much more complicated than that which was just outlined.⁷ 

6 The duality theorems of Chapter 6 must be extended somewhat for the present 
purpose since our pair of dual programs contain not just inequality constraints but also 
the equations 

$$q_1 + q_2 + q_3 = 1 \quad\text{and}\quad v_1 + v_2 + v_3 + v_4 = 1$$

(which, as they stand, contain no slack variables). 

7 Actually, for this argument to constitute the outline of a proof, it is necessary to 
show that the pair of dual programs in question possess optimal solutions. 


Our duality Theorem II also has an interesting game-theoretic inter- 
pretation. Since both players know the payoff figures, player B can, of 
course, compute A's optimal mixed strategy and its expected yields just 
as well as can A. We would, therefore, expect B not to play any strategy 
which offers to A more than his minimum expected return, L = S. Duality 
Theorem II tells us that this is precisely how he will behave if his strategy 
is optimal. For suppose the probabilities of A's optimal mixed strategy are 
such that strategy 1 turns out to be a poor one for player B; that is, if B 
plays strategy 1, A’s expected earnings will be greater than the minimum, 
L, to which B can force them. This means that in the first constraint con- 
dition in A's linear program the expected value of A's payoff, 
$50q_1 + 27q_2 + 64q_3$, will actually be greater than L. Duality Theorem II then 
tells us that the optimal value of $v_1$, the corresponding variable in the dual 
problem, will be zero. But, it will be recalled, $v_1$ is the probability that 
when B spins his mixed-strategy spinner it will tell him to play pure 
strategy 1. In other words, duality Theorem II tells us that the linear 
programming calculation will assign such odds to B's mixed strategy that 
he takes no chance on his random device, leaving him stuck with undesirable 
strategy 1. More generally, we see, then, that duality Theorem II always 
assigns probability zero to an inferior pure strategy. An optimal mixed 
strategy will automatically ensure the player against any risk of employing 
a pure strategy which enables his opponent to do well. 
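
Both duality theorems can be verified on the illustrative matrix. The sketch below is an addition to the text. It assumes the candidate pair q = (1, 0, 0) for A and v = (0, 0, 1, 0) for B, which the computation itself shows to be feasible in the two programs with L = S, and it then checks the complementary slackness property described in this paragraph. All variable names are illustrative.

```python
# A's payoffs (rows: A's strategies 1-3, columns: B's strategies 1-4).
PAYOFF = [[50, 90, 18, 25],
          [27, 5, 9, 95],
          [64, 30, 12, 20]]

q_star = [1, 0, 0]      # candidate optimal mixture for A (an assumption to be checked)
v_star = [0, 0, 1, 0]   # candidate optimal mixture for B (an assumption to be checked)

# A's expected payoff against each pure strategy of B, given q_star.
against_b = [sum(q_star[i] * PAYOFF[i][j] for i in range(3)) for j in range(4)]
# A's expected payoff from each of his pure strategies, given v_star.
against_a = [sum(v_star[j] * PAYOFF[i][j] for j in range(4)) for i in range(3)]

L = min(against_b)  # highest floor supported by q_star
S = max(against_a)  # lowest ceiling supported by v_star

# Theorem I: equal objective values for a feasible pair mean both are optimal.
theorem_one_holds = (L == S)

# Theorem II: a constraint with positive slack pairs with a zero dual variable.
slack_a = [payoff - L for payoff in against_b]   # slack in A's constraints
slack_b = [S - payoff for payoff in against_a]   # slack in B's constraints
theorem_two_holds = (
    all(v == 0 for v, s in zip(v_star, slack_a) if s > 0)
    and all(q == 0 for q, s in zip(q_star, slack_b) if s > 0)
)

print(L, S, slack_a, slack_b, theorem_one_holds, theorem_two_holds)
```

Here B's strategies 1, 2, and 4 leave slack in A's constraints (they would hand A more than 18), and each of them receives probability zero in B's mixture, as Theorem II requires.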

The programming view of the two-person, constant-sum game yields 
one other significant observation. We would normally expect that in an 
optimal solution a number of slack variables will take zero values. The 
corresponding constraints must then become equalities, i.e., the correspond- 
ing expected yields are all exactly equal to L. The economic interpretation 
is that a player picks the odds in an optimal mixed strategy in a way 
which offers his opponent little real choice—the opponent can choose 
among a number of strategies, but, typically, most of these offer him exactly 
the same payoff (the rest offer him even less since they give the first player 
more than L).⁸ 


8 Where the payoff matrix has exactly two rows and two columns and has no equi- 
librium point containing any pure strategy, it is clear that the equilibrium pair must 
consist of mixed strategies, i.e., all four numbers $q_1$, $q_2$, $v_1$, and $v_2$ must be positive. Hence, 
in a basic solution to such a problem all four slack variables must be zero. This states 
that the expected payoff to a player must be the same for both of his pure strategies if 
his opponent uses an optimal mixed strategy. 

This observation yields an easy method for the determination of the optimal mixed 
strategies in such simple (two-row, two-column) games. For example, in the second 
payoff table of this chapter the expected payoff to B's strategy 1 is $80q_1 + 40q_2$, and 
that to B’s strategy 2 is $20q_1 + 100q_2$, where $q_1 + q_2 = 1$, i.e., $q_2 = 1 - q_1$. If these 
expected payoffs are to be equal, we must have 

$$80q_1 + 40q_2 = 20q_1 + 100q_2$$

or 

$$80q_1 + 40(1 - q_1) = 20q_1 + 100(1 - q_1),$$

that is, 

$$80q_1 + 40 - 40q_1 = 20q_1 + 100 - 100q_1 \quad\text{or}\quad 120q_1 = 60,$$

so that $q_1 = \tfrac{1}{2}$ and the optimal mixed strategy for A involves $q_1 = \tfrac{1}{2}$, $q_2 = \tfrac{1}{2}$. The reader 
can check for himself that the optimal mixed strategy for B involves $v_1 = \tfrac{2}{3}$, $v_2 = \tfrac{1}{3}$. 
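
The equalization method of this footnote can be written as a small routine. The sketch below is an added illustration. The function name solve_2x2 is hypothetical, and the routine applies only to two-by-two games without a pure-strategy saddle point, for which it solves the same equations as those worked out above.

```python
from fractions import Fraction


def solve_2x2(payoff):
    """Optimal mixtures for a 2x2 zero-sum game with no pure saddle point.

    payoff[i][j] is the payoff to A when A plays i and B plays j.
    Returns (A's probabilities, B's probabilities, value of the game).
    """
    (a, b), (c, d) = payoff
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("payoffs cannot be equalized; look for a saddle point")
    # Equalize A's expected payoff across B's two columns:
    #   a*q1 + c*(1 - q1) = b*q1 + d*(1 - q1)
    q1 = Fraction(d - c, denom)
    # Equalize A's expected payoff across his own two rows:
    #   a*v1 + b*(1 - v1) = c*v1 + d*(1 - v1)
    v1 = Fraction(d - b, denom)
    if not (0 < q1 < 1 and 0 < v1 < 1):
        raise ValueError("game has a pure-strategy saddle point")
    value = a * q1 + c * (1 - q1)
    return (q1, 1 - q1), (v1, 1 - v1), value


# The second payoff table of the chapter, as implied by the footnote's figures.
q, v, value = solve_2x2([[80, 20], [40, 100]])
print(q, v, value)
```

For the table used in the footnote the routine reproduces the probabilities of 1/2 and 1/2 for A and of 2/3 and 1/3 for B, with an expected payoff of 60 to A.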
9. Strategy; the Extensive and Normal Form of a Game 


There is a matter of interpretation which it is well to take up before 
proceeding further. The term “strategy choice” has been employed in a 
way which makes it appear to denote a single move. However, as the con- 
cept is used in game theory it has been interpreted to mean a great deal 
more. The strategy really becomes an extensive book of rules indicating 
what the player intends to do, in every contingency, from the beginning 
of the game to the end. Thus the strategy commits the player to an entire 
sequence of moves which is contingent in a fully specified manner upon 
what is done by the other player. 

This clearly artificial device serves to collapse the entire game into 
two strategy choices, one by each player. Once these choices have been 
made, the subsequent history of the game is completely determined. One 
can, in principle, figure out all the rest. Of course, only in the simplest of 
game situations can a player even be conceived of as thinking in terms of 
strategies in this extreme sense. 

Some studies have been made of games considered move by move. 
Such an analysis is said to deal with games in their extensive form. How- 
ever, the bulk of the literature discusses games in the collapsed form that 
uses the strategy concept, referred to as games in normal form (meaning 
that the games have been rewritten in accord with this convenient mathe- 
matical norm—not that games are normally played in this way!). 


10. Two-Person, Nonconstant-Sum Games 


So far the discussion has dealt only with constant-sum games, i.e., only 
with games in which the behavior of the players has no effect on their 
combined payoff. Real economic problems are usually of the nonconstant- 
sum variety. For example, collusion can normally increase the total profits 
of a pair of duopolists, and two countries can usually do better by getting 
together than by declaring war on one another. Unfortunately, the theory 
is in a far less satisfactory state outside the area of the two-person, con- 
stant-sum game. 

In the literature, nonconstant-sum games are divided into two classes: 
cooperative and noncooperative, i.e., into games where collusion does and 
those where it does not occur. 

In the cooperative case the game theorists have tended to argue that 
the players will be sufficiently rational to discover and make full use of all 
opportunities which can be mutually advantageous. That is, the players 
are taken to cooperate on any and every action which can increase the 
payoff of either player (provided it does not, at the same time, reduce the 
payoff of the other). In the terminology of Chapter 16, Section 9, this 
states, then, that they will always end up somewhere on the contract curve. 

Of course, it is doubtful whether players are really so rational in prac- 
tice. Moreover, the problems involved in arriving at an acceptable division 
of the “take” may well prevent the players from maximizing their total 
loot as this rationality assumption requires! It is noteworthy that most of 
the novelty in the cooperative-case analysis occurs in investigation of the 
division of the spoils between colluding players. Nash has supplied a 
criterion for a reasonable or “fair” division which has been the subject of 
considerable attention and some criticism.⁹ 

Noncooperative, nonconstant-sum games will also be discussed briefly. 
They possess a number of interesting features: 


1. If such a game possesses several equilibrium pairs of strategies, they 
need not all yield the same payoff. Moreover, if (a, b) and (a’, b’) are 
equilibrium pairs, neither (a, b’) nor (a’, b) need be equilibrium pairs. 
Thus two properties of the zero-sum case no longer hold (cf. Section 4, 
above). This can greatly complicate the planning problems of both players 
since, if they do not aim for the same equilibrium pair, both may lose out. 

2. In the noncooperative, nonconstant-sum case it will often pay a 
player to publicize his plans, in marked contrast with the rather obvious 
advantage of secrecy in the zero-sum case. Disclosure may be useful either 
as a threat or as a means for transmitting information which permits a 
degree of tacit collusion: 


(a) Threat information: To a player who announces that he will 
drop a jar of nitroglycerine which will blow everyone up if he does not 
have his way, disclosure of this information is necessary for him to win 
his point. Curiously, a reputation for stupidity and stubbornness can 
be useful to the player who poses a threat because it will help convince 
the others that he really means it! Many mundane economic examples, 
such as strike threats, are easily cited.¹⁰ 

(b) Information for quasi-collusion: A company will often make 
certain that any price increases are well publicized in the hope or even 
the confident expectation that this move will soon be followed by other 
firms in the industry, to their mutual advantage. Other examples will 
doubtless occur to the reader. 


9 The Nash criterion states that if the status quo is (as a matter of convenience) 
evaluated at zero for both players, and if the players’ payoffs are evaluated by the 
recipients at $u_1$ and $u_2$, then a fair division is one which maximizes the product of these 
utilities, $u_1 u_2$. Nash derives this rather surprising arbitration formula from a set of 
axioms which are set up as reasonable criteria for a fair division of the spoils. A Zeuthen- 
Harsanyi model, which assumes that the bargainer who makes a concession is always 
the one whose percentage utility loss is the smaller, has also been shown to lead to the 
Nash solution. See John C. Harsanyi, “Approaches to the Bargaining Problem,” 
Econometrica, Vol. 24, April 1956. 

10 For a highly suggestive analysis of this and other related problems, see T. C. 
Schelling, The Strategy of Conflict, Harvard University Press, Cambridge, Mass., 1960. 


3. Another peculiarity of the nonzero-sum, noncooperative case is that 
both players will often be led by self-interest to take decisions which are 
mutually disadvantageous. This has been illustrated sharply by a game 
called the prisoners’ dilemma, which is attributed to A. W. Tucker. Two 
prisoners are brought in and interrogated separately. Each knows they 
will both get off if neither prisoner “talks.” However, they are both told 
that if one confesses and the other does not the one who fails to confess 
will receive a particularly heavy penalty. In this situation both players 
may well decide to protect themselves by confessing. 

This point is of considerable economic importance. It shows why 
citizens may not contribute taxes voluntarily even though each wants the 
government to function—the citizen sees nothing to be gained by paying 
taxes unless there is some guarantee that others will contribute too, just 
as one prisoner will confess unless he has some assurance that his fellow 
prisoner will not do so. Similarly, many storekeepers will keep their shops 
open on Sunday although they all prefer a holiday, each fearing that if he 
does not do so he will lose customers to his competitors. This argument is 
involved in the logic behind conscription and rationing in wartime, govern- 
mental anti-inflationary measures, etc. All of these measures are designed, 
at least in part, to achieve the cooperation which alone can prevent the 
loss to each player from his trying to protect himself when he has no 
assurance that others will behave as required for their mutual interest.¹¹ 
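
The logic of the prisoners' dilemma can be laid out with a small payoff table. The figures in the sketch below are hypothetical utilities chosen only for illustration. In particular, the small reward to a lone confessor is an assumption of the standard version of the game and is not stated in the account given above. The names PAYOFFS and best_reply are likewise illustrative.

```python
SILENT, CONFESS = "silent", "confess"
MOVES = (SILENT, CONFESS)

# Hypothetical utilities (first prisoner, second prisoner).
PAYOFFS = {
    (SILENT, SILENT): (0, 0),        # both get off
    (SILENT, CONFESS): (-10, 1),     # heavy penalty for the one who kept silent
    (CONFESS, SILENT): (1, -10),     # assumed small reward for a lone confessor
    (CONFESS, CONFESS): (-5, -5),    # both penalized
}


def best_reply(other_move):
    """The first prisoner's best move given the second prisoner's move."""
    return max(MOVES, key=lambda own: PAYOFFS[(own, other_move)][0])


# Confessing is the best reply whatever the other prisoner does.
confess_is_dominant = all(best_reply(other) == CONFESS for other in MOVES)

# Yet mutual confession leaves both worse off than mutual silence.
both_confess = PAYOFFS[(CONFESS, CONFESS)]
both_silent = PAYOFFS[(SILENT, SILENT)]
mutually_disadvantageous = all(c < s for c, s in zip(both_confess, both_silent))

print(confess_is_dominant, mutually_disadvantageous)
```

With these assumed figures each prisoner is led by self-interest to confess, and the pair ends up at the outcome which both regard as inferior to mutual silence.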


11. n-Person Games: Some Concepts 


Of most widespread potential economic application is the theory of 
many-person games, for most industries contain more than two firms, 
most real international trade problems involve more than two countries, 
and so on. But n- (many-) person games have so far proved rather intrac- 
table to analysis. Writings on the subject and results have been much fewer 
than in the case of the two-person, zero-sum game. Certainly there is 
nothing in n-person theory resembling the well-rounded analysis of the 
two-person case. 

Nevertheless, the literature is rich in suggestive ideas—definitions and 
concepts rather than theorems. Some but not all of these concepts are 
matters of common sense and common observation and it is only remarkable 
that they were given little attention in pre-game-theoretic economic theory. 
So far, in economic application, such suggestive concepts have been the 
most fruitful aspect of game theory; they have served to provide an 
illuminating way of looking at difficult problems rather than a source of 
cut-and-dried calculations. For these reasons the discussion of n-person 
games which follows is little more than a description of concepts and defi- 
nitions, and it is organized accordingly. These will, however, enable the 
reader to form an impression of the present state of n-person theory. 


11 Indeed, I have suggested that this argument is central to the rationale of govern- 
mental control in a democratic society. See my Welfare Economics and the Theory of 
the State, G. Bell & Sons, Ltd., London, 2nd ed., 1965, esp. Chapters 7-9 and 12. 

In this theory, games are again divided into the cooperative and non- 
cooperative varieties. Only one result will be reported for the noncoopera- 
tive case. Nash has proved that every noncooperative game in which each 
player has only a limited number of strategy alternatives open to him 
has at least one (mixed- or pure-strategy) equilibrium point. In other 
words, there exists at least one combination of mixed or pure strategies 
(a, b, c, ..., n) such that, if they are employed by players (A, B, C, ..., N), 
respectively, it will be unprofitable for any one of these players to switch 
to any other strategy. Thus there exist strategy combinations which have 
this self-policing feature: If all players but one follow this pattern, the 
self-interest of the remaining player will also lead him to stick to the 
equilibrium pattern. 

However, an n-person game may possess more than one equilibrium 
point, and there may then arise the difficulties which were mentioned in 
the two-person, nonconstant-sum case: Different equilibrium points may 
yield different payoffs to the players, and if some players aim for one 
equilibrium point and the remaining players aim for another, they may all 
end up at a nonequilibrium point! Hence, in the absence of coordination of 
their plans, if a game possesses a number of equilibrium points, the players 
may find it difficult to attain any one of them. 
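
The coordination difficulty just described can be exhibited in a small example. The three-player game in the sketch below is hypothetical and was constructed only for this illustration: every player receives 2 if all choose alternative 0, 1 if all choose alternative 1, and nothing otherwise. The routine enumerates the pure-strategy combinations from which no single player can profitably depart.

```python
from itertools import product

STRATEGIES = (0, 1)
N_PLAYERS = 3


def payoff(profile):
    """Hypothetical payoffs: 2 each if all play 0, 1 each if all play 1, else 0."""
    if all(s == 0 for s in profile):
        return (2,) * len(profile)
    if all(s == 1 for s in profile):
        return (1,) * len(profile)
    return (0,) * len(profile)


def is_equilibrium(profile):
    """True if no single player gains by switching strategy on his own."""
    current = payoff(profile)
    for player in range(len(profile)):
        for alternative in STRATEGIES:
            deviated = profile[:player] + (alternative,) + profile[player + 1:]
            if payoff(deviated)[player] > current[player]:
                return False
    return True


equilibria = [p for p in product(STRATEGIES, repeat=N_PLAYERS) if is_equilibrium(p)]
print(equilibria)

# Two players aim for one equilibrium point, the third for the other.
mismatch = (0, 0, 1)
print(is_equilibrium(mismatch), payoff(mismatch))
```

The game has two equilibrium points with different payoffs, and players who aim for different ones land on a combination which is not an equilibrium and which pays nothing to anyone.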

We now turn to a listing of the central concepts of the theory of n- 
person cooperative games: 


1. Coalitions. In the two-person game there is no possibility of several 
players combining against the rest. Such collusive arrangements can ob- 
viously arise in a many-person game. In game theory this sort of combina- 
tion of players is called a coalition. 

Obviously there are many cases where a coalition can add to the “take” 
of its members by successfully exploiting the remaining players. However, 
there are some games in which coalitions offer no net advantage to their 
members (an economic example might involve the costs of administration 
eating up the profits of any coalition). A game of the relatively uninter- 
esting variety in which there is no motivation for coalition formation is 
called an inessential game, as contrasted with essential games in which the 
members of a coalition can benefit from its formation. 


2. Side payments. Sometimes, in order for a coalition to maximize its 
returns, it may be necessary for a member to undergo some sacrifice. For 
example, a cartel may find it profitable to close the inefficient plant of one 
of its members rather than getting every member to reduce his scale of 
operations. In this case, in order to induce the short-changed individual 
to serve the interests of the coalition it is necessary to set up an equaliza- 
tion payment (bribe) for him. In game theory such a redivision of the 
spoils is called a side payment. 

3. Imputation. Any assignment of payoffs to the players is called an 
imputation if it meets two acceptability requirements: 

(a) Each player must receive at least as much as he can get for 
himself when all other players are arrayed against him. If this condi- 
tion is not met, any player who receives the short end of the payoff 
allocation can refuse to go along with the coalition structure from 
which these payoffs result. Obviously, he can always hold out for at 
least the amount which he can obtain for himself without anyone’s 
help. 

(b) A second requirement for an imputation is that the total of all 
the payoffs to all of the players combined equals the maximum amount 
they can get by forming one grand universal coalition in which every 
member is included.¹² This second condition of group rationality is, in 
fact, widely violated in practice. When farmers fail to get together to 
restrict their total outputs, they end up with a total “take” which is 
lower than the maximum (monopoly) amount. The same is true when 
several countries adopt restrictive tariff policies and all of them end up 
poorer as a result. In other words, an imputation may be described as 
a set of payoffs which could be achieved by the players in a game if 
they were more rational than they are in reality. 


4. The core. Some imputations may satisfy a condition of group ra- 
tionality which is even stronger than (b) above. This more stringent con- 
dition requires that any set of individuals jointly earn from the proposed 
imputation at least as much as they can obtain by getting together and 
forming themselves into a coalition. Of course, people do not, in fact, think 
out every possible coalition they can conceivably form and what they may 
hope to earn by joining it, so that this condition is certainly not met in 
practice. The set of all possible imputations which meets this difficult 
requirement is called the core of a game. However, many games have no 
core. Indeed, it has been proved that any zero-sum game which has a core, 
i.e., for which such imputations exist, must be inessential! 


12 This implies that any imputation must be Pareto optimal (cf. Chapter 21, Section 3, 
and Chapter 23, Section 5), for otherwise it would be possible to make a change which 
was advantageous to some of the players and disadvantageous to no one, i.e., the group 
would initially not have obtained the maximum “take” as rationality condition (b) re- 
quires. It also follows that a Neumann-Morgenstern n-person game solution (defined 
below) must be Pareto optimal. 


5. Characteristic function. But how does one determine how much will 
be paid to each player in different circumstances? Given the coalitions 
which are formed, what payoffs will be received by the members of each 
coalition? Von Neumann and Morgenstern approached this question by 
describing lower limits to these amounts. Given any coalition C, the worst 
that can possibly happen, from its point of view, is that all other players 
will combine against it in one grand countercoalition. But if the total take 
of the two coalitions is fixed at its maximum possible value [group ration- 
ality requirement (b) of an imputation] this transforms the problem into a 
constant-sum, two-person (two-coalition) game. Since such a game can 
always be solved by the linear programming methods described in Section 
8, above, we can calculate how much will be earned by our coalition C in 
these unfavorable circumstances. 

In this way a minimum-earnings figure can be computed for every 
possible coalition. If payoffs are measured in utility terms, the relationship 
R = v(S), which gives this minimum payoff, R, for every possible coali- 
tion S, is called the characteristic function of the game. The Neumann- 
Morgenstern analysis of the n-person game is based largely on the charac- 
teristic function. However, such an analysis is bound to leave out much 
relevant information about the game because it concentrates exclusively 
on the worst outcomes for each coalition. 

The following plausible result is among the theorems on characteristic 
functions: If two coalitions combine, the value of the characteristic func- 
tion for the combination will equal or exceed the sum of the values for the 
uncombined coalitions, i.e., the combined coalition will earn at least as 
much when the rest of the world is against it as the two subcoalitions can 
earn for themselves in similar circumstances. 

6. Domination. An imputation, I, is said to dominate another imputa- 
tion, J, if there exists at least one coalition C which can be sure (in terms 
of the characteristic function) of earning for its members an amount, v(C), 
which is at least as large as that prescribed for them by imputation I and 
if, in addition, every member of C receives more from imputation I than 
from imputation J. In other words, J is dominated by I if a set of players 
who are in a position to prevent imputation J from supplanting I find it 
profitable to prevent J. It is, of course, possible for two imputations to 
dominate one another if coalition S prefers I to J and can prevent J, and 
if coalition T prefers J to I and is in a position to prevent I. 

7. Solution. Von Neumann and Morgenstern define a solution of 
an n-person game as a set of imputations which has the following 
characteristics: 

(a) If I and J are any two of the imputations in a solution, then 
neither I dominates J nor J dominates I. 

(b) If K is an imputation which is not included in the solution set, 
then there is at least one imputation, K*, which dominates K and 
which is included in the solution. Thus, a solution consists of a set of 
imputations none of which dominates any other, and which can among 
them dominate any excluded imputation. 
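
The definitions of imputation, core, characteristic function, and domination can be made concrete with a small numerical game. The three-person characteristic function in the sketch below is hypothetical and was chosen only for illustration, as were the two imputations examined. Domination is tested in the sense that some coalition can assure its members what the dominating imputation prescribes for them and every member of that coalition prefers it.

```python
from itertools import combinations

PLAYERS = (1, 2, 3)

# Hypothetical characteristic function: the least each coalition can assure itself.
V = {
    frozenset(): 0,
    frozenset({1}): 0, frozenset({2}): 0, frozenset({3}): 0,
    frozenset({1, 2}): 40, frozenset({1, 3}): 50, frozenset({2, 3}): 60,
    frozenset({1, 2, 3}): 100,
}


def coalitions():
    """All nonempty coalitions of the players."""
    return [frozenset(c) for r in range(1, len(PLAYERS) + 1)
            for c in combinations(PLAYERS, r)]


def is_superadditive():
    """Disjoint coalitions never lose by combining."""
    cs = coalitions()
    return all(V[a | b] >= V[a] + V[b] for a in cs for b in cs if not (a & b))


def is_imputation(x):
    """Requirements (a) and (b): individual and group rationality."""
    individually_rational = all(x[i] >= V[frozenset({i})] for i in PLAYERS)
    group_rational = sum(x.values()) == V[frozenset(PLAYERS)]
    return individually_rational and group_rational


def in_core(x):
    """Every coalition receives at least what it could assure itself."""
    return is_imputation(x) and all(sum(x[i] for i in c) >= V[c] for c in coalitions())


def dominates(x, y):
    """x dominates y if some coalition can assure x's amounts and all its members prefer x."""
    for c in coalitions():
        effective = sum(x[i] for i in c) <= V[c]
        preferred = all(x[i] > y[i] for i in c)
        if effective and preferred:
            return True
    return False


I = {1: 40, 2: 25, 3: 35}
J = {1: 70, 2: 20, 3: 10}
print(is_superadditive(), is_imputation(I), is_imputation(J))
print(in_core(I), in_core(J), dominates(I, J), dominates(J, I))
```

In this example both I and J are imputations, but only I lies in the core. The coalition of players 2 and 3 can assure itself 60, which covers what I gives its members, and both members prefer I to J, so that I dominates J while J does not dominate I.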


There are several difficulties involved in this concept. First of all, a 
solution usually includes a number—sometimes an infinite number—of 
possible imputations. That is, a solution only lists for us a number of pos- 
sible outcomes to a game. Thus it does not usually tell us how the game 
will or should end up. It only confronts us with a list, and sometimes a 
very large list, of possible alternatives. 

The situation is even worse than this suggests, because a game may, 
and often does, possess a number of alternative solutions (each with its 
multiplicity of imputations). It is clear, then, that the solution concept 
does not permit us to calculate any unique outcome for the general game. 

Moreover, although it is known that the number of solutions in the 
three-person game is usually embarrassingly large, it is not known whether 
there are games which do not possess even a single solution even where 
the number of players is restricted to a number as low as five. Shapley has 
also shown that there are games for which the solutions constitute strange 
and unpredictable sets, that is, cases which make it very difficult to set up 
general rules about the nature of solution sets.¹³ 

Because the solution concept permits so much indeterminacy and is 
not fully satisfactory in other respects, a number of alternative concepts 
have been explored. Milnor has set up several sets of criteria which, he 
suggests, an imputation should meet in order to be considered reasonable. 
These criteria are designed primarily to get rid of some of the possible 
imputations on the ground that they are in some sense not “reasonable.” 
Vickrey has proposed a concept which he calls a strong solution, consisting 
only of imputations and coalitions such that if anyone defects from one of 
the included coalitions he is apt to regret, it because there exist-alternative 
imputations which tempt his new partners to “double-cross” him in turn. 
Luce has constructed a theory which takes into account the fact that there 
are institutional constraints on the formation and breakup of coalitions. 


13 The solution concept has also been criticized for its reliance on the characteristic 
function. That is, in practice, one imputation, I, may in effect "dominate" another, J, 
even though the characteristic function indicates that no coalition C can prevent J 
profitably. For the characteristic function gives only the most conservative estimate of 
what C can hope to achieve, and in practice C may often be expected to do much better 
than that. 
That is, it recognizes that social mores may well prevent the formation of 
certain types of coalition. This points up what is admittedly the main 
weakness of game theory in its present stage of development—the relative 
lack of specific sociological, psychological, and economic content in its 
premises. Until such material is supplied, it is unreasonable to expect the 
mathematics to yield the empirically applicable results which are not con- 
tained in its assumptions. 

This concludes a rather disjointed discussion of concepts of n-person 
theory, which should nevertheless at least offer the reader a hint of its 
flavor. At any rate this section should suggest both the strength and 
weaknesses of game theory from the points of view of the economist and 
the operations researcher: its weakness as a source of devices for the 
calculation of categorical answers to competitive problems, and its strength 
as a suggestive frame of reference within which the structure of these 
problems and the alternatives available to the decision-maker may be 
seen more clearly. 
Decision Theory* 


19 


1. The Subject Matter of Decision Theory 


Contemporary theory follows Knight’s distinction between risk and 
uncertainty.1 Risk refers to situations in which the outcome is not certain, 
but where the probabilities of the alternative outcomes are known, or can 
at least be estimated. Uncertainty is present where the unknown outcomes 
cannot even be predicted in probabilistic terms, that is, it refers to con- 
tingencies against which one cannot protect oneself on ordinary insurance 
principles. 

In game theory, choice problems which involve risk are analyzed with 
the aid of utility theory, as we have seen in the last two chapters. One 
makes that decision whose expected utility (the average utility of the 
alternative outcomes each weighted by its probability of occurring) is 
highest. Decision theory has been developed to deal with problems of choice 
or decision-making under uncertainty, where the probability figures re- 
quired for the utility calculus are not available. 
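
The expected-utility rule for choice under risk can be sketched in a few lines of Python. This is only an illustration: the decision names, utilities, and probabilities below are hypothetical and do not come from the text.

```python
from fractions import Fraction


def expected_utility(utilities, probabilities):
    """Average utility of the outcomes, each weighted by its probability."""
    if sum(probabilities) != 1:
        raise ValueError("probabilities must sum to 1")
    return sum(u * p for u, p in zip(utilities, probabilities))


# Hypothetical known probabilities of three alternative outcomes.
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

# Hypothetical decisions, each listing the utility of every outcome.
decisions = {"X": [10, 0, 4], "Y": [6, 6, 2]}

# Under risk, the decision with the highest expected utility is chosen.
best = max(decisions, key=lambda d: expected_utility(decisions[d], probs))
```

Under uncertainty the list `probs` is exactly what is missing, which is why the decision criteria discussed in this chapter are needed.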


* As in the last chapter, the reader who wishes to learn more about the theory, or 
who desires further references, is advised to consult R. Duncan Luce and Howard Raiffa, 
Games and Decisions, John Wiley & Sons, Inc., New York, 1957, especially Chapter 13. 

1 F. H. Knight, Risk, Uncertainty and Profit, Houghton Mifflin Company, Boston, 
1921. Reprinted by the London School of Economics, series of reprints of scarce tracts in 
Economics No. 16, 1933. 
Quite a bit of the games apparatus has been carried over into decision 
theory. As will soon be shown, the payoff matrix, the strategy concept, 
and the minimax approach all make their appearance again. But there is 
one fundamental difference between the problems of game and decision 
theory which cannot be overemphasized. In game theory, at least in the 
zero-sum, two-person case, there is a major element of predictability in 
the behavior of the second player. He is out to do everything he can to 
oppose the first player. If he knows any way to reduce the first player’s 
payoff, he can be counted upon to employ it. In decision theory the second 
player is not even, strictly speaking, an opponent. Often this second 
player is referred to as nature and the corresponding decision problems are 
called games against nature. But our player cannot count upon nature to 
oppose him. In fact he cannot count on nature to do anything in particular. 

This chapter follows the bulk of the decision-theory literature by 
treating only the so-called complete-ignorance case, that is, the case where 
the player who is to make a decision has absolutely no clue as to what the 
other player is going to do. Once there is available any information about 
his rival’s likely behavior, however fragmentary, the requirements of the 
complete-ignorance case are violated. 

However, the standard complete-ignorance analysis supposes, at least 
implicitly, that the player has at his disposal a large amount of other 
types of information—more, in fact, than a relatively well-informed busi- 
nessman is likely to have in practice. In assuming that he can describe 
his problem in terms of a payoff matrix, the player is taken to possess a 
list of the strategy alternatives which are open to himself as well as those 
which are available to his opponent. In addition, he is assumed to know 
the magnitudes of all of the elements in the payoff matrix. This means, for 
example, that if a businessman player adopts a particular inventory policy 
and the demand for his product turns out to follow some particular time 
pattern, say falling at first, subsequently rising sharply, and finally leveling 
off (this time pattern is considered “nature’s strategy choice”), then the 
businessman knows, or believes he knows, exactly what payoff he will 
receive as a result of this (and every other) pair of his and nature’s strategy 
choices. This is the sort of information which is conveyed by the numbers 
in his payoff matrix. 

The player must also be recognized to have at his disposal a very differ- 
ent kind of highly pertinent information, for he knows something about 
himself: his own financial position and his attitude toward taking chances. 
Together, these must determine to what extent he will desire and can 
afford to gamble. The validity of any rules for rational decision-making 
under uncertainty must, then, be contingent upon at least these two ele- 
ments—the player's psychological makeup and his pecuniary circumstances. 
Since attitudes toward gambling and financial circumstances differ 
from person to person, it is clear that there can be no one universally 
valid rule which tells a player how to choose among the strategies that 
are open to him. The appropriate decision criterion must vary from person 
to person and from one situation to another.2 

It is not surprising, therefore, that a considerable number of alternative 
decision rules have been proposed. At present, the bulk of the literature 
of decision theory relates to such decision-rule proposals. Let us now 
examine in turn the most frequently discussed of these decision criteria. 

1. The maximin criterion. As in game theory, one of the most conservative 
of decision rules is the maximin criterion. For each possible strategy 
the player determines the worst that can possibly happen, and then picks 
the strategy which is “least worst,” i.e., whose most unattractive contin- 
gency is least disastrous. 

In the present context the maximin strategy is somewhat less attractive 
than it is in a games situation, where the player has an active opponent 
whose interests are in direct conflict with his own. In such circumstances 
there can be good reason for fearing the worst. But where one’s opponent 
is nature, who, at least in calmer moments, cannot be considered a sys- 
tematic and calculating opponent, the maximin approach is rather clearly a 
manifestation of pure cowardice. This is not meant to imply that cowardice 
is necessarily irrational. On the contrary, there is much to be said for the 
Falstaffian position on self-preservation. There are persons and situations 
where the maximin strategy is entirely appropriate, but it is well to recognize 
the criterion for what it is. 

As an illustration, consider the following payoff matrix (which is 
carefully chosen to make the maximin criterion show up badly): 


          C      D      E 
A       100      2      1 
B        99     98      0 


If our player employs strategy A, his worst payoff is one (1) (which he 
receives if nature employs strategy E), whereas if he employs strategy B 


2 It must be made clear, however, that this relativistic view is my own, and it is not 
a standard feature of the writings on decision theory. But cf. the Hurwicz α criterion 
described below. 
his lowest possible payoff is zero. Hence his maximin strategy is A, because 
it offers him the larger of the two minimal payoffs. 
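
A minimal Python sketch of the maximin rule follows, using the payoffs quoted in the text (A yields 100, 2, and 1, and B yields 99, 98, and 0, against nature's C, D, and E). The function names are illustrative choices, not part of the original exposition.

```python
# Payoff matrix from the text: rows are the player's strategies,
# columns are nature's strategies.
PAYOFFS = {
    "A": {"C": 100, "D": 2, "E": 1},
    "B": {"C": 99, "D": 98, "E": 0},
}


def security_levels(payoffs):
    """Worst payoff that each strategy can possibly yield."""
    return {s: min(row.values()) for s, row in payoffs.items()}


def maximin(payoffs):
    """Strategy whose worst payoff is largest (the 'least worst')."""
    levels = security_levels(payoffs)
    return max(levels, key=levels.get)
```

Only the row minima enter the comparison, so the 98 that B earns against D plays no part in the choice.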

This table also illustrates an objection which has been raised against 
the maximin rule, whose conservatism will now be shown, in some circumstances, 
to be somewhat specious. From the point of view of the rational 
conservative, there is much to be said in favor of strategy B, because its 
highest and lowest payoffs (in case nature employs strategies C or E) are 
fairly close to those of A, whereas B's intermediate payoff, 98 (if nature's 
strategy is D), is much higher than A's intermediate payoff, 2. Hence B 
appears to offer an excellent hedge against the possibility that neither the 
best nor the worst possible outcome will be realized. 

The source of the difficulty is that the maximin criterion disregards 
most of the information in the payoff matrix. It considers only the worst 
possibility in each row, and makes its recommendation with complete 
disregard for the values of the other elements. Hence it is always possible 
to find cases in which the nature of these other numbers in the payoff 
matrix casts doubt upon the wisdom of the maximin choice. As will be seen 
presently, a similar criticism applies to most, but not all, of the other 
decision criteria which have been proposed. 

2. The maximax criterion. A second decision criterion, which does not 
seem to have been put forth seriously anywhere in the literature, is worth 
describing because it is at the very opposite end of the scale of venture- 
someness from the maximin rule. The maximax criterion, which is a deci- 
sion rule well suited to the temperament of a plunger, considers only the 
most glittering prize offered by any strategy and is blind to any other 
contingencies. It calls for the player always to choose that gamble whose 
first prize is highest, no matter what the dangers in the relative values of 
the other prizes and penalties. 

In terms of our payoff matrix, it is clear that the maximax criterion 
advises the decision-maker to employ strategy A, whose highest payoff, 
100, exceeds the 99 first prize of strategy B. Two observations are relevant: 

(a) This illustration shows that the extremely gambling-oriented maximax 
rule can sometimes recommend the same course of action as the maximin 
rule, the counsel of timidity—their advice will coincide when one strategy 
carries with it the best of both first and booby prizes; (b) like the maximin 
criterion, the maximax rule ignores all intermediate prizes and so may 
suggest to a player that he give up a very great advantage in the less 
glittering payoffs, for a negligible difference in the highest prize. 
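
The maximax rule can be sketched in the same way. The fragment below is an illustration only; it uses the payoffs quoted in the text, and the helper names are assumptions made for the example.

```python
# Payoff matrix from the text: rows are the player's strategies,
# columns are nature's strategies.
PAYOFFS = {
    "A": {"C": 100, "D": 2, "E": 1},
    "B": {"C": 99, "D": 98, "E": 0},
}


def first_prizes(payoffs):
    """Highest payoff offered by each strategy."""
    return {s: max(row.values()) for s, row in payoffs.items()}


def maximax(payoffs):
    """Strategy whose first prize is highest, ignoring every other payoff."""
    prizes = first_prizes(payoffs)
    return max(prizes, key=prizes.get)
```

Here the choice of A rests entirely on the one-unit difference between 100 and 99, which is observation (b) in concrete form.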

3. The Hurwicz α criterion. As a (reportedly somewhat tongue-in-cheek) 
compromise, Hurwicz has proposed that a weighted average of the minimum 
and maximum payoffs of each strategy be employed as a decision 
criterion. For example, if we weight the minimum payoff (security value) 
of any strategy at α = 3/4, and the maximum payoff at 1/4, then the Hurwicz α 
criterion would evaluate strategy A in the payoff table at 


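\[ 1 \cdot \tfrac{3}{4} + 100 \cdot \tfrac{1}{4} = 25\tfrac{3}{4} \]
and strategy B at 
\[ 0 \cdot \tfrac{3}{4} + 99 \cdot \tfrac{1}{4} = 24\tfrac{3}{4}. \]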


Hence, it would again select strategy A. This is to be expected, since both 
the maximin and maximax criteria selected A, and the Hurwicz criterion 
is, in effect, a weighted average of the two in which the weights are designed 
to reflect the player’s psychology. Like the other two, the Hurwicz criterion 
clearly ignores a strategy’s less extreme payoffs in its computations. 
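
The Hurwicz calculation can be checked with a short Python sketch. It assumes, as the arithmetic in the text implies, a weight of 3/4 on the minimum payoff and 1/4 on the maximum; the function names are illustrative.

```python
from fractions import Fraction

# Payoffs of each strategy against nature's C, D, and E (from the text).
PAYOFFS = {"A": [100, 2, 1], "B": [99, 98, 0]}


def hurwicz_value(row, alpha):
    """alpha weights the minimum (security) payoff, 1 - alpha the maximum."""
    return alpha * min(row) + (1 - alpha) * max(row)


def hurwicz_choice(payoffs, alpha):
    """Strategy with the highest weighted average of its extreme payoffs."""
    return max(payoffs, key=lambda s: hurwicz_value(payoffs[s], alpha))


ALPHA = Fraction(3, 4)
values = {s: hurwicz_value(row, ALPHA) for s, row in PAYOFFS.items()}
```

Setting `alpha` to 1 reproduces the maximin ranking and setting it to 0 reproduces the maximax ranking, which is the sense in which the criterion is a weighted average of the two.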

4. The Bayes (Laplace) criterion. A criterion whose history is far older 
than the others that have been described is the Bayes, or equiprobability- 
of-the-unknown, criterion. This states that if we have absolutely no information 
about the relative probabilities of nature’s strategies C, D, and 
E, we must assign equal probabilities to them in our calculations and then 
adopt the strategy whose expected payoff is highest. 

In our payoff table, this criterion evaluates A at 


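\[ \tfrac{1}{3} \cdot 100 + \tfrac{1}{3} \cdot 2 + \tfrac{1}{3} \cdot 1 = 34\tfrac{1}{3} \]
while B is rated at 

\[ \tfrac{1}{3} \cdot 99 + \tfrac{1}{3} \cdot 98 + \tfrac{1}{3} \cdot 0 = 65\tfrac{2}{3}. \]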


Unlike the others which have been examined, then, the Bayes criterion 
ranks B ahead of A. It does so because the Bayes rule is the only one of 
the criteria so far examined which takes all possible payoffs into account. 
For the first time the 2 payoff possibility of strategy A and the corre- 
sponding 98 payoff of strategy B have entered the calculations, and these 
have turned the tide in favor of B. 

In this respect, then, the Bayes criterion is more appealing than the 
others. 
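
A brief Python sketch of the equiprobability calculation, using the payoffs quoted in the text, may make the contrast with the earlier criteria concrete. The function names are illustrative assumptions.

```python
from fractions import Fraction

# Payoffs of each strategy against nature's C, D, and E (from the text).
PAYOFFS = {"A": [100, 2, 1], "B": [99, 98, 0]}


def laplace_value(row):
    """Expected payoff when every one of nature's strategies is equally likely."""
    p = Fraction(1, len(row))
    return sum(p * x for x in row)


def laplace_choice(payoffs):
    """Strategy with the highest equiprobable expected payoff."""
    return max(payoffs, key=lambda s: laplace_value(payoffs[s]))
```

Every entry of a row enters the sum, so the intermediate payoffs 2 and 98 now count as much as the extremes.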

However, the Bayes criterion does suffer from a serious limitation. 
The difficulty is that it is not clear in advance what unknown possibilities 
are to be considered equally probable. To illustrate this point, let us con- 
sider an economic situation which can lead to a payoff matrix like ours. 
Suppose our player is considering whether to sell ice cream (strategy A) 
or hot dogs (strategy B) at a baseball game. We may divide nature’s 
strategies into the three possibilities, C: sunshine, D: cloudiness, and E: 
rain (or other forms of precipitation). In the complete absence of meteor- 
ological information we might consider C, D, and E to be equally probable 
and assign them each the probability 1/3 as was just done. 
Alternatively, however, we might have decided that the major contingencies 
to consider are rain vs. nonrain. Because we possess no relevant 
information, it can be argued just as persuasively as before that these two 
contingencies are equally likely and each should be assigned the probability 
1/2. We see, then, that by a simple act of reclassification, the a priori 
probability assigned to the rain contingency (nature’s strategy E) has 
been raised from 1/3 to 1/2! In other words, unless we have some advance 
information on the number of categories into which the alternatives should 
be classified, the Bayes equiprobability-of-the-unknown approach can leave 
the relevant probability figure completely ambiguous. By breaking the 
alternatives down into enough different categories, we can assign any one 
strategy a probability as low as we like. 
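
The effect of reclassification can be illustrated with a small Python sketch. The equiprobable prior follows the text; the further step of splitting the nonrain probability evenly between sunshine and cloudiness is an assumption added here purely so that expected payoffs can be compared.

```python
from fractions import Fraction

# Payoff matrix from the text (C: sunshine, D: cloudiness, E: rain).
PAYOFFS = {
    "A": {"C": 100, "D": 2, "E": 1},
    "B": {"C": 99, "D": 98, "E": 0},
}


def equiprobable_prior(categories):
    """Assign the same probability to every listed category."""
    p = Fraction(1, len(categories))
    return {c: p for c in categories}


def expected_payoff(row, probs):
    """Probability-weighted average of one strategy's payoffs."""
    return sum(row[state] * probs[state] for state in row)


three_way = equiprobable_prior(["sunshine", "cloudiness", "rain"])
two_way = equiprobable_prior(["rain", "nonrain"])

# Assumption (not in the text): nonrain's 1/2 is shared equally by C and D.
reclassified = {
    "C": two_way["nonrain"] / 2,
    "D": two_way["nonrain"] / 2,
    "E": two_way["rain"],
}
values = {s: expected_payoff(row, reclassified) for s, row in PAYOFFS.items()}
```

In this example B is still ranked ahead of A, but the figures attached to both strategies change with the classification, which is the ambiguity the text describes.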

A variant on the Bayes procedure is to ask the decision-maker to assign 
subjective probabilities to nature’s possible strategies. If for some intuitive 
reason he feels that C, D, and E may reasonably be assigned probabilities 
of $5, 55, and je, respectively, the expected value computation can be 
repeated using these figures instead of the 1/3, 1/3, and 1/3 probabilities of an 
equiprobability-of-the-unknown calculation. With these figures, strategy 
A, for example, would be evaluated at 


100-35; +2-¥ +1- i5 = 213. 


5. The minimax regret criterion. The last criterion to be discussed was 
proposed by Savage. His rule concentrates on the opportunity cost of an 
incorrect decision. The approach is to protect the player against excessive 
cost of mistakes. From the original payoff matrix, a second matrix showing 
the cost of mistakes (the regret) is calculated. For this purpose it is neces- 
sary that the elements of the original payoff matrix be expressed in utility 
terms. Suppose that this is true of the data in our illustrative matrix and 
that nature’s strategy turns out to be D. If the player employs his strategy 
B, he obtains the maximum payoff against D (98 as against 2 utils) and 
so he has nothing to regret. The corresponding regret figure is, conse- 
quently, zero, which is entered in the regret matrix at the juncture of 
row B and column D as shown. 

Regret Matrix 

          C      D      E 
A         0     96*     0 
B         1*     0      1* 

But if our player had instead employed A 
against nature’s D, he would have earned only 2 utils as compared with 
the maximum possible payoff of 98, so that his net loss, his degree of 
regret, is 98 — 2 = 96. This is entered in the row A, column D space. 
The rest of the regret matrix is computed similarly from the original 
payoff matrix, as the reader should check for himself. To protect himself 
against excessive loss, the player may now apply a minimax rule to this 
matrix. In each row the maximum loss is starred, and that strategy is 
chosen whose row contains the smallest of the maximum regret elements. 
In this case, strategy B is recommended by a minimax regret rule, because 
the worst that can happen to the player who chooses B is a 1-util regret. 
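
The construction of the regret matrix and the minimax choice can be sketched in Python as follows. The payoffs are those quoted in the text, taken here to be in utils; the function names are illustrative.

```python
# Payoff matrix from the text, assumed to be expressed in utils.
PAYOFFS = {
    "A": {"C": 100, "D": 2, "E": 1},
    "B": {"C": 99, "D": 98, "E": 0},
}


def regret_matrix(payoffs):
    """Regret = best payoff available against nature's strategy minus payoff received."""
    states = list(next(iter(payoffs.values())))
    best = {n: max(row[n] for row in payoffs.values()) for n in states}
    return {s: {n: best[n] - row[n] for n in states} for s, row in payoffs.items()}


def minimax_regret(payoffs):
    """Return the strategy with the smallest maximum regret, plus each row's maximum."""
    regrets = regret_matrix(payoffs)
    worst = {s: max(r.values()) for s, r in regrets.items()}
    return min(worst, key=worst.get), worst
```

The dictionary of row maxima corresponds to the starred entries: 96 for A and 1 for B.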

At first glance it may appear that this criterion, because it recommends 
strategy B rather than A, overcomes the problem that was raised in con- 
nection with the maximin, maximax, and Hurwicz criteria. It is true that 
the minimax regret criterion can take into account large disparities in 
intermediate payoffs. In our illustration, nature’s strategy D, with its 
prohibitive 96-util regret figure, assumes a crucial role even though it 
offers neither the maximum nor the minimum payoff for any of our player’s 
strategies (see the original payoff matrix). However, the minimax regret 
criterion runs into the same problem in a slightly different manner. Since 
it is a minimax criterion, it considers only the largest regret figure in any row, 
and ignores any other data. Hence low and intermediate regret numbers 
are disregarded. If one strategy, F, has a very slightly smaller highest 
regret figure than another strategy, G, the criterion will recommend F 
even if every other regret figure in G is much lower than the corresponding 
number in F. 

The appropriateness of the measure of regret which is employed by 
the criterion has also been called into question. It is not clear from the 
Neumann-Morgenstern utility index that the difference between the 
utilities of two payoffs is a good measure of the player's regret when he 
receives the smaller of the two. Perhaps the regret measure can be considered 
a somewhat crude and arbitrary, though not a totally unreasonable, 
measure of the player's loss in choosing the wrong strategy. 


6. Mixed strategies. Rather than choosing a pure strategy directly, the 
decision-maker may prefer to let a random device—a coin or a spinner— 
make his choice for him. As in the game-theory case, such a decision is 
called a mixed strategy. As before, the player can, by using the utility 
calculus, compute the expected utility yield of any mixed strategy corre- 
sponding to any one of nature’s strategies. For example, if the player 
employs that mixed strategy which involves 50-50 odds of choosing either 
strategy A or B, then if nature plays its strategy C, the expected payoff 
will be 

\[ \tfrac{1}{2}\,(\text{the A payoff}) + \tfrac{1}{2}\,(\text{the B payoff}) = \tfrac{1}{2} \cdot 100 + \tfrac{1}{2} \cdot 99 = 99.5. \]
Continuing the calculation, one obtains the following augmented payoff 
matrix: 

                     C       D       E 
A                  100       2       1 
B                   99      98       0 
[1/2 A, 1/2 B]    99.5      50     0.5 


The last row indicates the alternative expected payoffs of the mixed 
strategy [1/2 A, 1/2 B], which represents a 50 per cent chance of either A or B. 

By varying these odds, e.g., to three to one, etc., these expected payoffs 
of the mixed strategy are, of course, changed. It then becomes possible 
to calculate at which odds the mixed strategy will yield the largest maximin 
value, or, if we prefer, the odds which give the lowest minimax regret 
figure.3 Such a set of odds is said to constitute an optimal mixed strategy. 
The decision-maker may then prefer to employ an optimal mixed strategy. 
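
For the two-strategy payoff matrix of the text, the search for optimal odds can be sketched in Python. The sketch rests on two assumptions that are not spelled out in the text: the regret of a mixture is taken to be its expected regret against each of nature's strategies, and the search relies on the fact that each expected figure is linear in the probability p of playing A, so the best odds must lie at p = 0, p = 1, or where two of these lines cross.

```python
from fractions import Fraction
from itertools import combinations

# Payoffs against nature's C, D, and E (from the text).
ROW_A = [100, 2, 1]
ROW_B = [99, 98, 0]


def mix(p, a_row, b_row):
    """Expected figures when A is played with probability p and B with 1 - p."""
    return [p * a + (1 - p) * b for a, b in zip(a_row, b_row)]


def candidate_odds(a_row, b_row):
    """Endpoints plus crossing points of the lines (a - b) * p + b."""
    cands = {Fraction(0), Fraction(1)}
    lines = [(Fraction(a - b), Fraction(b)) for a, b in zip(a_row, b_row)]
    for (s1, c1), (s2, c2) in combinations(lines, 2):
        if s1 != s2:
            p = (c2 - c1) / (s1 - s2)
            if 0 <= p <= 1:
                cands.add(p)
    return sorted(cands)


def best_maximin_mix(a_row, b_row):
    """Probability of A that maximizes the smallest expected payoff."""
    return max(candidate_odds(a_row, b_row), key=lambda p: min(mix(p, a_row, b_row)))


def best_minimax_regret_mix(a_row, b_row):
    """Probability of A that minimizes the largest expected regret."""
    col_max = [max(a, b) for a, b in zip(a_row, b_row)]
    ra = [m - a for m, a in zip(col_max, a_row)]
    rb = [m - b for m, b in zip(col_max, b_row)]
    return min(candidate_odds(ra, rb), key=lambda p: max(mix(p, ra, rb)))
```

With these payoffs the maximin calculation gains nothing from mixing (the best odds put all the weight on A), whereas the minimax regret calculation does: playing A with probability 1/97 lowers the maximum expected regret from 1 to 96/97.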

Some comment on the concept of the mixed strategy is called for at 
this point. It may seem rather irrational for the decision-maker to permit a 
coin to make up his mind for him, and few if any businessmen are prepared 
to adopt this as a standard decision-making procedure. However, 
the maximiner is a fundamentally timid man who fears that his opponent 
(whether it be nature or another player) will always outguess and outplay 
him. This is the logic behind his disregard of anything but the least favorable 
outcome for any pure-strategy choice. A mixed-strategy calculation, 
however, no longer disregards these more favorable payoffs; rather, it 
deals with their average or expected value. This can be rationalized by 
the argument that when the player’s decisions are made by a coin or some 
other random device he can be sure that no one will outguess him and so 
those more favorable payoffs which he formerly left out of his calculations 
now become very real possibilities, and they must therefore appear in his 
mixed-strategy calculations. Thus, the higher security value of a mixed 
strategy must be treated as a subtle reflection of the fact that a mixed 
strategy can prevent the player’s opponent from predicting his decision. 


3 Mixed strategies are not helpful to the player who employs a maximax or a Bayes 
criterion. Since the expected utility figure is a weighted average of the pure strategies, 
it will give something intermediate between the highest and the lowest of the items 
being averaged. This process of averaging therefore tends to raise the lowest figures (it 
increases the security level) of a pure strategy, and is therefore useful to the maximiner. 
However, because it is an average, the expected values will tend to fall short of the 
maximum figures, i.e., a mixed strategy will tend to reduce the expected return of a 
maximaxer. 

The user of a Bayes criterion never benefits from a mixed strategy for a similar but 
somewhat more complex reason. On a Bayes criterion, both strategies A and B are 
themselves evaluated by means of a weighted average of their payoffs. Thus, as we have 
seen with our payoff matrix, strategy A is evaluated at 34 1/3 and B at 65 2/3. A mixed 
strategy will in turn be evaluated, on the Bayes criterion, at a weighted average of these 
two figures (their expected value). This average must be less than the 65 2/3 value of B, 
so that the pure strategy B will always be preferred to the mixture of A and B. Thus, 
a Bayes calculation always evaluates a mixed strategy at some figure intermediate between 
the highest and lowest values of the pure strategies, and so one of the pure 
strategies will always be preferred on this criterion. 


PROBLEM 


In the following payoff matrix (constructed by John Milnor) show that 
strategy A will be chosen by a Bayes criterion, strategy B will be selected 
by the maximin criterion, C by the Hurwicz α (for α < 3/4), and D by a 
minimax regret criterion: 

[Payoff matrix not recoverable from the source.] 


First, it is necessary to describe [...] which is involved. To keep the 
diagram [...] which offers exactly two possible payoffs. That is, the 
general pure strategy [...] payoff which will be designated by V if [...] 
some other payoff, W, if it employs its [...] such strategies are shown in 
the following payoff matrix, which will provide our illustrations 
throughout this section: 
          C      D 
A         6      3 
B         2      8 

Here pure strategy A has two possible payoffs, V = 6 and W = 3; for 
pure strategy B we have V = 2 and W = 8. 

[Figure 1: panels (a) and (b); payoff V on the horizontal axis, payoff W on the vertical axis.] 

Given the payoffs, a point representing such a strategy can be plotted 
at once on a diagram which measures off the magnitude of payoff V on its 
horizontal axis and that of W on its vertical axis. Thus strategies A and 
B are represented in this way in Figure 1a. Any point in the diagram clearly 
represents a pair of payoffs and, therefore, some (hypothetical) strategy, 
and, conversely, every possible strategy with a pair of alternative payoffs 
can be represented by such a point. If V and W are interpreted as expected 
payoffs, such a point can also represent a mixed strategy whose expected payoffs 
are the coordinates of the point. We may familiarize ourselves with 
the nature of the representation by noting the following: 


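1. [If and only if each payoff of one strategy, E, is higher than the 
corresponding payoff of another, A] (we say then that E strongly dominates A), 
point E will lie above and to the right of A. 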

2. If and only if one strategy, E, has payoffs no smaller than those of 
another, B, and if just one of E's payoffs is higher than the corresponding 
payoff of B (E weakly dominates B), point E will lie either directly above 
or directly to the right of point B. 
3. If some strategy point lies directly on the 45-degree line through 
the origin (point T in Figure 1b), the corresponding strategy will involve 
no uncertainty, since the two alternative payoffs will be equal. Thus, T 
has a payoff of 4 dollars if nature employs strategy C and (also) 4 dollars in 
the alternative event that nature plays D. At any point, such as A, which 
is not on the 45-degree line, the two possible payoffs V and W are unequal 
and we say that there is dispersion in the payoffs. 
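
These geometric observations translate directly into comparisons of (V, W) pairs, as the Python sketch below suggests. It reads strong dominance as both payoffs being higher (an interpretation of "above and to the right") and weak dominance as no payoff smaller with exactly one higher; the point E is hypothetical, while A, B, and T take the values given in the text.

```python
def strongly_dominates(e, a):
    """Both payoffs of e exceed those of a: e lies above and to the right of a."""
    return e[0] > a[0] and e[1] > a[1]


def weakly_dominates(e, b):
    """No payoff of e is smaller and exactly one is higher."""
    no_smaller = e[0] >= b[0] and e[1] >= b[1]
    higher = (e[0] > b[0]) + (e[1] > b[1])
    return no_smaller and higher == 1


def has_dispersion(point):
    """True when the two alternative payoffs differ (point off the 45-degree line)."""
    return point[0] != point[1]


# (V, W) pairs: A and B from the payoff matrix, T from the 45-degree line.
A, B, T = (6, 3), (2, 8), (4, 4)
E = (7, 5)  # hypothetical strategy point, chosen for illustration
```

Neither A nor B dominates the other, which is why a further criterion is needed to rank them.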


Suppose now that the player’s preferences among these possible strategy 
points can be described by a set of indifference curves drawn through the 
diagram (Figure 1b). This is a rather strong premise since it means that 
the player must be able to rank every such possible strategy. However, 
that is precisely what is done by any one of the criteria which have been 
described. In any event, it does not seem much less plausible than 
the corresponding assumption behind the ordinary indifference map 
construction. 

The only difference between this and the ordinary indifference map 
construction is that in the usual case the payoffs V and W represented 
on the axes are received by the player together (at point K he receives 
V plus W) whereas in our diagram the payoffs are alternatives (he 
receives either V or W, but not both). 

The shapes of the players’ pure-strategy indifference curves will vary 
with their attitudes toward uncertainty. For example, we might expect 
that the indifference curves of a person who has an aversion to gambling 
will be convex to the origin like those in Figure 1b (or in the extreme case, 
like those of Figure 2a). To see why, we note that this shape (a diminishing 
marginal rate of substitution of W for V) means that successive equal in- 
crements in one of the payoffs will compensate the player only for ever- 
smaller reductions in the other payoff. He considers one increasingly 
glittering prize to be poor compensation for a continued proportionate 
deterioration in the alternative payoff.4 


4 The connection between convexity to the origin and desire for a low dispersion in 
payoffs can be shown somewhat more rigorously as follows. For any probability numbers 
q and 1 − q, the straight line qV + (1 − q)W = K (FF′ in Figure 1b) is the locus 
of combinations of payoffs, V and W, all of which yield the same expected payoff, K. 
If the player is averse to gambling, he would therefore presumably prefer point T on the 
45-degree line, where the dispersion in the payoffs is zero, to either points A or Y on 
line FF′ since all three points represent pairs of payoffs whose expected values are the 
same. Choose Y to be a point which is indifferent to A. If both points are indifferent 
also to some point on the 45-degree line, it must be a point which is inferior to T, i.e., 
Y and A must be indifferent to a point such as X nearer to the origin (i.e., of lower payoffs) 
than T. Thus the indifference curve YXA must be convex to the origin. 


Similarly, the gambler who is anxious to give up the protection of a 
fairly good, second-best payoff in return for a more glittering first prize 
may be expected to have pure-strategy indifference curves which are 
concave to the origin. 

It will now be shown that the maximin, the maximax, the Hurwicz, 
and the Bayes criteria each require that the decision-maker’s pure-strategy 
indifference curves be of a very special shape which varies from criterion 
to criterion. For example, in the maximin ranking of a strategy, only the 
smaller of its payoffs is taken into account. This means (Figure 2a) that 
below the 45-degree line where payoff W (the ordinate of any point) is 
the smaller of the two coordinates, an indifference curve is any horizontal 
line W = constant, because the player will never be indifferent between 
two strategies for which the smaller payoffs, their W’s, are not the same. 
For the same reason, above the 45-degree line the indifference curves must 
be the vertical lines V = constant. In other words, only for an individual 
whose pure-strategy indifference curves are like those in Figure 2a will it 
be appropriate to use the maximin criterion. 

[Figure 2: panels (a) to (d), showing the indifference maps of the maximin, Bayes, maximax, and Hurwicz α criteria; payoff V on the horizontal axis, payoff W on the vertical axis.] 

Similarly, since the Bayes criterion evaluates strategies at (1/2)V + (1/2)W, 
the equation of a Bayes indifference curve is (1/2)V + (1/2)W = K (constant), 
i.e., W = −V + 2K. These curves are the parallel straight lines of slope 
−1 which are shown in Figure 2b. The use of other subjective probability 
numbers instead of fifty-fifty odds only changes the slope of these parallel 
straight lines. The reader can readily show that a maximax criterion 
(according to which the larger of the numbers V or W is constant along 
an indifference curve) and a Hurwicz α criterion (α multiplied by the 
smaller of the numbers V and W, plus 1 − α multiplied by the larger of 
these two numbers = a constant) require indifference maps of the kinds 
shown in Figures 2c and 2d, respectively. The reader should note that 
strategy A is on the higher indifference curve in Figure 2a, that B is the 
preferred move in 2b and in 2c, and that in 2d they are indifferent. 
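
The rankings claimed for the four panels can be verified numerically with a short Python sketch, taking A = (6, 3) and B = (2, 8) from the payoff matrix. The value α = 2/3 used for the Hurwicz case is an inference made here, not a figure stated in the text: it is the weight on the smaller payoff at which the two strategies come out exactly indifferent.

```python
from fractions import Fraction

# (V, W) payoff pairs of the two pure strategies, from the text.
A, B = (6, 3), (2, 8)


def maximin_index(p):
    """Only the smaller payoff counts (indifference map of Figure 2a)."""
    return min(p)


def bayes_index(p, q=Fraction(1, 2)):
    """Expected payoff with probability q on V and 1 - q on W (Figure 2b)."""
    return q * p[0] + (1 - q) * p[1]


def maximax_index(p):
    """Only the larger payoff counts (Figure 2c)."""
    return max(p)


def hurwicz_index(p, alpha):
    """alpha times the smaller payoff plus (1 - alpha) times the larger (Figure 2d)."""
    return alpha * min(p) + (1 - alpha) * max(p)


ALPHA = Fraction(2, 3)  # inferred value at which A and B are indifferent
```

Each index function is constant along one indifference curve of its panel, so comparing index values is the same as asking which strategy lies on the higher curve.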

Since there is no reason to believe that every person’s indifference map 
will assume (the same) one or even any of the forms in these figures, it 
follows that no one of these criteria is a universal prescription for rationality. 
Strategies can be ranked, but only by the criterion which happens to be 
appropriate to the particular decision-maker in light of his psychological 
and financial circumstances as reflected in the shape of his indifference 
map. 


4. Axiomatization 


An alternative approach to the decision problem has employed what is 
called the axiomatic method. By setting up as axioms a number of require- 
ments for an acceptable decision criterion, several authors have been able 
to come up with unique decision rules, e.g., several writers have shown that 
the Bayes criterion is the only one which satisfies the sets of axioms which 
they have proposed. Before we go into further detail, a few preliminary 
words on the axiomatic method are appropriate. 

Axiomatization is one of the mathematician’s very powerful and 
fruitful methods. In using it, the analyst sets out in explicit mathematical 
form the assumptions which he is willing to use in his investigation. He 
then employs rigorous mathematical techniques to deduce from these 
axioms as many of their implications as he can. Often the derived theorems 
are extremely surprising and bear little obvious relationship to the axioms 
from which they are deduced. The axiomatic method, then, has two very 
attractive features. It forces the analyst to set his assumptions out 
explicitly, and it puts him in a position to deduce rigorously the implications, 
both obvious and obscure, of his a priori notions about the problem as 
expressed in these axioms. 


5 The convexity to the origin of the indifference curves of the extremely conservative 
maximining player, and the concavity of the extreme gambling maximaxer’s curves are 
in line with the interpretations of convexity and concavity given above. 

The Savage minimax regret criterion cannot be represented in so simple a diagram 
since it involves direct comparison of all the elements in several strategies so that more 
than two dimensions are required for the indifference map even where each strategy 
has only two possible payoffs. However, if we deal with the regret matrix rather than 
the payoff matrix, the indifference map is again that shown in Figure 2a, since Savage 
applies a minimax (maximin) criterion to the regret data. 

However, it must be recognized that while mathematical statements 
are always explicit, they are often not transparent. The literature abounds 
with axioms whose meaning is in dispute or which turn out to mean some- 
thing other than what their author intended. It is true that a mathematical 
axiom must have everything there—the author cannot simply hint at 
some of its features and keep reservations in back of his head in a fuzzy 
statement, as he is able to do in a literary discussion. But if the axiom 
requires a complex mathematical formulation (though simplicity, too, 
can sometimes be deceptive), the more subtle nuances of its meaning may 
be obvious neither to the analyst nor to his audience. There are a number 
of innocuous-sounding premises in the literature whose critical implications 
belie their apparent innocence. None of this is meant as a criticism of the 
axiomatic method. It amounts only to the trite injunction that powerful 
weapons should be used with very great caution. 

The axiom systems which have been employed in the decision-theory 
literature are fairly complex and abstract, and there will be no attempt to 
describe any of them here. Implicitly, such a system must specify some- 
thing about the needs and desires of the decision-maker if it is to be used 
to derive some specific decision rule. 

The next section describes such a derivation. It is selected for its sim- 
plicity and it is unfortunate that it is not one of the standard axiomatic 
treatments of the literature of decision theory, all of which are too difficult 
for our expository purposes. Like a number of the standard analyses, this 
illustrative axiomatization will be shown to rule out all decision rules ex- 
cept the Bayes criterion. However, it does not follow that this is true of 
any axiom system. Indeed, Milnor has described a set of axioms corre- 
sponding to each of the decision rules which this chapter has described. 


5. Neumann-Morgenstern Utility and the Bayes Criterion⁶ 


It will be shown in this section that a simple extension of the Neumann- 
Morgenstern utility assumptions rules out anything but the Bayes criterion. 
That is, a person whose psychology is as described by these axioms must, 
if he is consistent, employ a Bayes decision rule (with subjective a priori 
probabilities assigned by him to nature's strategies, if he prefers). 


6 This section is somewhat more difficult than the preceding portions of the chapter. 


472 Decision Theory Chapter 19 


First it is necessary to discuss the applicability of the utility axioms to 
the decision problem. Consider a player who is trying to make up his 
mind between one of two equally priced refrigerators. The theory is willing 
to assign utility numbers to these objects and to assume that the player 
will choose that refrigerator whose utility is highest. But suppose, e.g., 
that refrigerator A is better adapted to storing tall objects and that B is 
designed primarily for heavy items. Since the consumer cannot be entirely 
certain in advance what he will be buying over the lifetime of the refrigerator, 
the choice between A and B must represent a strategy decision 
against an uncertain future. Similarly, the acquisition of any other durable 
item, such as a factory, can also be interpreted as a strategy choice. 

Generalizing from this we may interpret two strategies A and B in a 
payoff matrix as two refrigerators, or two factories, or two tickets to a 
game with fixed prizes but unknown odds. The player may be certain of 
possessing ticket A and hence he may evaluate the utility of the ticket 
just as he does that of a refrigerator. Moreover, it is possible to assume 
that the player’s ranking of these strategies satisfies the Neumann-Morgen- 
stern utility axioms. Certainly, casual inspection of the axioms suggests 
that they are no less persuasive than usual when applied to refrigerators, 
to factories, or to any other tickets of admission to a game involving un- 
certainty rather than risk. Thus we adopt for our illustrative purposes 


Axiom 1. The player’s ranking of strategies satisfies the Neumann- 
Morgenstern utility axioms. 


In addition, it is necessary for our purposes to specify explicitly an 
essential feature of a game against nature—the fact that nature is not a 
calculating opponent so that the decision-maker has nothing to gain by 
camouflaging his strategy intention. Thus 

Axiom 2. The utility of a strategy is dependent only on the payoffs 
and probabilities which it involves. (In particular, its utility is not affected 
by an attempt to conceal from a competitor the fact that one has decided 
to play it.) 

[Figure 3. Horizontal axis V, with labeled points O, V_T = (1 − q)s, M_S = ps, and S.] 

These two axioms together permit us to make a standard calculation 
of the utility of a mixed strategy. For if M is the mixed strategy which 
chooses A with odds q and B with odds 1 − q, we have, by the usual rules: 
the utility of M = q × utility of A + (1 − q) × utility of B; or, in symbolic 
notation,⁷ 

$$U(M) = qU(A) + (1 - q)U(B).$$ 

Finally we adopt 


Axiom 3. The decision-maker is indifferent between any two strategies 
whose payoffs or expected payoffs are identical. 


The theorem about the Bayes rule can now be derived geometrically. 

In Figure 3 consider two strategies R and S which are represented by 
points lying on the W and V axes, respectively, so that for the former 
strategy we have the payoffs $V_R = 0$ and $W_R$ = length OR = r (some 
number), etc.; thus the payoff matrix is 

              V    W 
    R         0    r 
    S         s    0 

Any strategy point T which lies on the straight line connecting R and 
S divides that line into some proportions which we designate 1 − q and q. 
The mixed strategy M which plays R with probability q and S with 
probability 1 − q has expected payoffs which are the same as those of T: 
its payoffs V = (1 − q)s and W = qr (see figure) are the coordinates of 
T. We can now proceed with our proof in two parts, which shows that 
the indifference curves must be straight and parallel as a Bayes criterion 
requires. 


Part 1: Let S be chosen so that it is indifferent with R. Then the indifference 
curve connecting them is a straight line. 

Proof: By hypothesis, since R and S are indifferent, we have U(S) = 
U(R), i.e., the utility of S equals that of R. By the usual Neumann- 
Morgenstern formula, our mixed strategy, M, has the utility 

$$U(M) = qU(R) + (1 - q)U(S) = qU(S) + (1 - q)U(S) = (q + 1 - q)U(S) = U(S) = U(R).$$ 

Thus the mixed strategy M is indifferent with both S and R. But by 
Axiom 3, pure strategy T is indifferent with M (since they have the same 
expected payoffs). Hence any pure strategy T on the line RS must be 
indifferent with both strategies R and S. 


7 ... Strategy A will be much more valuable if there is a fair chance that the 
enemy will not make the right ... say, U*(A), and U(B) to ... 

$$U(M) = qU^*(A) + (1 - q)U^*(B) > qU(A) + (1 - q)U(B).$$ 


Part 2: All of the decision-maker's remaining indifference curves are 
parallel to RS. 


Proof: Consider the pure strategy O both of whose payoffs are zero, 
and which is therefore represented by the origin of the diagram. Form the 
two mixed strategies, $M_R$ and $M_S$, where $M_R$ is defined as R with any fixed 
probability p and O with probability 1 − p, and where mixed strategy 
$M_S$ is S with probability p and O with probability 1 − p. We have 

$$U(M_R) = pU(R) + (1 - p)U(O)$$ 

and 

$$U(M_S) = pU(S) + (1 - p)U(O).$$ 

Since S and R have been chosen to be indifferent, so that U(R) = U(S), 
it follows at once that $U(M_R) = U(M_S)$, i.e., that the two mixed strategies 
are indifferent. Hence by the argument of part 1, the straight line $M_R M_S$ 
connecting the mixed strategy points is an indifference curve. But (Figure 
3) $M_S$ has coordinates (ps, 0), and $M_R$ has coordinates (0, pr). Hence 
$OM_R/OM_S = pr/ps = r/s = OR/OS$, that is, the slopes of the two lines are 
equal, and the straight-line indifference curve $M_R M_S$ is therefore parallel to RS. 

This proves our theorem because any parallel straight-line indifference 
curves satisfy the Bayes criterion. This has already been indicated in the 
discussion of Figure 2b.⁸ 


8 More rigorously, any one of these lines has an equation of the form W = −kV + c, 
where k and c are any numbers and k > 0. Let π be a number defined by π = k/(1 + k). 
Then we have 0 < π < 1, so that π can be a probability number. Moreover, solving for 
k in terms of π we have k = π/(1 − π). Substituting this expression for k into the 
equation of our line, we get 

$$W = -\frac{\pi}{1 - \pi}V + c \quad\text{or}\quad \pi V + (1 - \pi)W = (1 - \pi)c = \text{a constant},$$ 

which is the equation of a Bayes indifference curve with probability π of payoff V and 
1 − π of payoff W. 

Thus we have proved the theorem that a simple extension of the 
Neumann-Morgenstern utility axioms requires the rational decision- 
maker to employ a Bayes criterion. It follows, incidentally, that there is 
some conflict between the maximin strategy and these utility axioms. The 
source of this difficulty is the extreme pessimism of the maximin strategy 
user which is not shared by a person to whom the utility axioms are ap- 
plied. The former views the worst possible payoff of any strategy as the 
only possibility worth considering, whereas the utility calculator takes all 
payoffs into account. As already indicated, this difference in outlook can 
be rationalized by arguing that a mixed strategy protects the player from 
being outguessed by his opponent, a possibility which seems doubtful in 
games against nature, and which is therefore ruled out by Axiom 2 of this 
section. 
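
The geometric argument of this section can be checked numerically. The sketch below is only an illustration under stated assumptions: the payoffs r = 6 and s = 3, the mixing odds q and p, and all function names are hypothetical. It computes the probability k/(1 + k) implied by the slope k = r/s of the line RS, and confirms that R, S, and a point T on RS share one Bayes value, while the scaled-down strategies on the parallel line share another.

```python
from fractions import Fraction


def implied_probability(r, s):
    """Probability of payoff V implied by indifference between R = (0, r) and S = (s, 0)."""
    k = Fraction(r) / Fraction(s)  # the line RS is W = -kV + r, with k = r/s
    return k / (1 + k)


def bayes_value(point, prob_v):
    """Expected payoff of a strategy point (V, W) when V has probability prob_v."""
    v, w = point
    return prob_v * v + (1 - prob_v) * w


# Hypothetical numbers, for illustration only.
r, s = Fraction(6), Fraction(3)
prob_v = implied_probability(r, s)

R = (Fraction(0), r)
S = (s, Fraction(0))

q = Fraction(1, 4)                  # T divides RS in the proportions 1 - q and q
T = ((1 - q) * s, q * r)

p = Fraction(1, 2)                  # mixing R (or S) with the origin O
M_R = (Fraction(0), p * r)
M_S = (p * s, Fraction(0))

print("implied probability of V:", prob_v)
print("values on RS:", bayes_value(R, prob_v), bayes_value(S, prob_v), bayes_value(T, prob_v))
print("values on the parallel line:", bayes_value(M_R, prob_v), bayes_value(M_S, prob_v))
```

Exact fractions are used so that the equalities hold without rounding error.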


6. Decision Theory and the Foundations of Statistics 


Before concluding this chapter it is worth indicating briefly how the 
decision analysis has been used to reorient some of the literature on the 
foundations of statistics. 

To illustrate the nature of this application, consider a simple problem 
of statistical quality control. A television tube manufacturer has a sample 
of tubes taken out of each day’s production and tests every tube in the 
sample. Unless too many of the sample tubes are found to be defective, 
the entire day’s production is just packed up and shipped without further 
examination. The statistical problem is how large a sample should be 
chosen for inspection and what proportion of defectives in the sample 
ought to be considered excessive, i.e., what is the proper border line be- 
tween an acceptable and an unacceptable sample. 

In conventional statistical analysis it is customary to make some prob- 
abilistic calculation indicating the degree of assurance provided by differ- 
ent sample sizes and rejection levels that the number of defectives in the 
total output batch will fall short of some specified number. Ultimately, 
the sample design decisions are made more or less arbitrarily, after a check 
that these decisions can be considered reasonable. 

But the statistical problem is really one in which it is meaningful to 
look for an optimal decision. Too small a sample or too liberal a rejection 
level means that the percentage of defectives in the firm’s shipments is 
likely to be high, and this can prove costly both in the cost of servicing 
under the manufacturer’s guarantee, and in the loss of customer good will. 
On the other hand, as many firms have found to their sorrow, excessively 
rigid quality-control standards can be very expensive and can force the 
manufacturer to price his product out of the market. Clearly, some intermediate 
quality-control standards must be optimal, and an analytic approach 
to this problem of statistical design must seek such an optimum. 

This problem is now readily translatable into standard decision-theoretic 
terms. Each relevant sample-size, rejection-level combination may be 
considered a strategy of the quality controller and, for each of these, the 
alternative possible payoffs to the firm may be entered in a payoff matrix. 
The rest of the analysis can then employ the methods of decision theory. 
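
A minimal sketch of this translation follows. Everything numerical in it is an assumption made for illustration: the two states of nature and their defect rates, the payoffs of shipping or rejecting a day's output, the cost per tube tested, and the binomial model of the sample. A strategy is a pair (n, c): test n tubes and ship the batch if at most c are defective.

```python
from math import comb


def accept_probability(n, c, defect_rate):
    """P(at most c defectives in a sample of n), assuming a binomial model."""
    return sum(
        comb(n, k) * defect_rate**k * (1 - defect_rate) ** (n - k)
        for k in range(c + 1)
    )


# Hypothetical states of nature and payoffs (illustrative numbers only).
DEFECT_RATES = {"good_day": 0.02, "bad_day": 0.15}
SHIP_PAYOFF = {"good_day": 100.0, "bad_day": -200.0}   # bad shipments cost good will
REJECT_PAYOFF = {"good_day": -50.0, "bad_day": -50.0}  # cost of holding back a batch
COST_PER_TEST = 0.5


def payoff_row(n, c):
    """Expected payoff of strategy (n, c) under each state of nature."""
    row = {}
    for state, rate in DEFECT_RATES.items():
        pa = accept_probability(n, c, rate)
        row[state] = (
            pa * SHIP_PAYOFF[state]
            + (1 - pa) * REJECT_PAYOFF[state]
            - COST_PER_TEST * n
        )
    return row


strategies = [(n, c) for n in (10, 20, 50) for c in (0, 1, 2)]
matrix = {strategy: payoff_row(*strategy) for strategy in strategies}


def bayes_choice(matrix, prior):
    """Strategy with the highest expected payoff under the prior over states."""
    return max(matrix, key=lambda s: sum(prior[st] * matrix[s][st] for st in prior))


def maximin_choice(matrix):
    """Strategy whose worst payoff across states is largest."""
    return max(matrix, key=lambda s: min(matrix[s].values()))


print("Bayes choice:", bayes_choice(matrix, {"good_day": 0.9, "bad_day": 0.1}))
print("maximin choice:", maximin_choice(matrix))
```

The two rules need not select the same sampling plan, which is the difficulty of choosing among criteria that the earlier sections describe.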

The same sort of comment applies to statistical problems more general 
than that of commercial quality control. The entire theory of testing of 
hypotheses is subject to the same considerations. The customary use of 
tests conducted at a 95 or a 99 per cent level of significance is essentially 
arbitrary and does not take into explicit account the costs and benefits 
(payoffs) of alternative significance levels. 

These considerations have, as yet, had little influence on the methods 
of applied statistics. Rather, they have affected mostly the relatively 
abstract and philosophical discussions of the foundations of statistics. In 
part, this is because decision theory is still in a rudimentary state and can 
offer no firm and final answers to the questions of statistical design. Among 
the sources of the unsolved problems are the difficulties involved in ob- 
taining data for the payoff matrix and the fact that the results of a decision- 
theoretic calculation depend on whether one chooses to employ a maximin, 
or a Bayes, or some other decision rule, and we have, as yet, no systematic 
procedure for making this choice. However, the discussion has served to 
call attention to the optimality problem—which is fundamental to most 
statistical problems—and to indicate some new and illuminating ways in 
which these problems can be viewed. 


1. Interdependence in the Economy: Substitutes and Complements 


General equilibrium theory was developed to take account of a cardinal 
feature of the structure of our economy: the interdependence of its parts. 
A rise in the price of automobiles can reduce the demand for tires and in- 
crease the demand for bus transportation. A rise in wages may increase 
imports, reduce exports, and increase the use of labor-saving machinery. 
The set of examples can be expanded indefinitely. 

Two types of interdependence relationship which have received considerable 
notice are substitutability and complementarity. Substitute goods 
are items which serve similar purposes, so that the buyer may choose from 
among the set of substitutes which serve his desires. Usually substitutes 
are imperfect so that the buyer will not be indifferent between them—they 
serve somewhat the same purpose, but do so imperfectly. Some examples 
of substitutes are raincoats and umbrellas, chicken and turkey, coal and 
fuel oil (which are also substitute inputs as well as substitute consumers’ 
goods). Complementary goods are items which people (at least sometimes) 
wish to use jointly: cheese and wine, shirts and neckties, needles and thread. 
Note that labor and machinery may be either substitute or complementary 
inputs, depending on the context of the problem. 

480 General Equilibrium and the Theory of Money Chapter 20 


Substitute and complementary goods have been defined in terms of the 
effect of a change in the price of one of such a pair of goods on the demand 
for the other. If we omit the income effect, a reduction in the price of one 
of a pair of substitute items should decrease the demand for the other (a 
fall in the price of leather should reduce the demand for plastic furniture 
coverings) whereas the reverse holds for complementary goods (a reduction 
in the price of television sets may increase the demand for beer and aspirin). 

It follows that many commodities whose relationship is only slight will 
be at least mild substitutes because they are competitors for the consumer’s 
limited stock of purchasing power. A fall in the price of houses can reduce 
attendance at concerts because more houses may be bought and the new 
house owners may not be able to afford as many evenings out after meeting 
their monthly bank payments. 

The upshot of the discussion is that a demand (or a supply) function 
for commodity x should not just include the price of x as its only price 
variable. In fact, to be on the safe side, it is customary in a general equilibrium 
demand function to include every price in the economy as a possibility, 
i.e., to say that the demand for any item is, at least potentially, 
dependent on the price of every other item in the economy. 
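
A demand function of this kind can be sketched as follows. The linear form, the four goods, and every coefficient are hypothetical, chosen only to illustrate how the sign of a cross-price effect separates substitutes from complements (income effects are ignored, as in the definition above).

```python
def demand(good, prices, intercepts, slopes):
    """Linear demand: an intercept plus a coefficient on every price in the economy."""
    return intercepts[good] + sum(slopes[good][other] * prices[other] for other in prices)


def relationship(good, other, slopes):
    """Classify `other` by the effect of its price on the demand for `good`."""
    effect = slopes[good][other]
    if effect > 0:
        return "substitute"   # a fall in the other price lowers this demand
    if effect < 0:
        return "complement"   # a fall in the other price raises this demand
    return "unrelated"


# Hypothetical economy with four goods.
prices = {"raincoats": 20.0, "umbrellas": 10.0, "needles": 1.0, "thread": 2.0}
intercepts = {"raincoats": 100.0, "needles": 50.0}
slopes = {
    "raincoats": {"raincoats": -2.0, "umbrellas": 0.5, "needles": 0.0, "thread": 0.0},
    "needles": {"raincoats": 0.0, "umbrellas": 0.0, "needles": -1.0, "thread": -0.25},
}

cheaper_umbrellas = dict(prices, umbrellas=8.0)
cheaper_thread = dict(prices, thread=1.0)

print(relationship("raincoats", "umbrellas", slopes))
print(relationship("needles", "thread", slopes))
print(demand("raincoats", prices, intercepts, slopes),
      demand("raincoats", cheaper_umbrellas, intercepts, slopes))
print(demand("needles", prices, intercepts, slopes),
      demand("needles", cheaper_thread, intercepts, slopes))
```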


2. Equations of General Equilibrium 


Suppose an economy has 2,053 different commodities, including bonds, 
stocks, and factories as well as ordinary consumer’s goods. Let us treat 
money as another one of these goods—item 2,054 in the list. In accord 
with the discussion of the last section, if hats are item no. 12, the demand 
for hats will be given by an expression 


$$Q_{12} = D_{12}(P_1, P_2, \ldots, P_{2054}, A, M).$$ 


This states that the number of hats demanded depends on the price of 
every one of the 2,054 commodities, $P_1, P_2, \ldots, P_{2054}$. In addition, demand 
will depend on the wealth of the economy (presumably the wealthier the 
economy, the greater the demands for commodities). The wealth of the 
economy is summed up by the variables A (an index of its holdings of 
physical assets—buildings, farmland, factories, etc.) and M, the stock of 
cash in existence. It will be noted that there is no explicit income variable 
included in this discussion since consumers’ income is presumably given by 
the prices of the commodities which they sell for a living—e.g., the price 
of labor time (the wage rate)—and these prices already appear in the de- 
mand function. 

There is a similar supply function for item no. 12 (hats), which may be 
expressed as 


$$S_{12}(P_1, P_2, \ldots, P_{2054}, A, M).$$ 


The economy is said to be in a state of general equilibrium if the supply 
of every commodity is equal to the demand for it. That is, for the 2,054 
items in the economy, the following 2,054 equations must all be satisfied: 

$$
\begin{aligned}
S_1(P_1, P_2, \ldots, P_{2054}, A, M) &= D_1(P_1, P_2, \ldots, P_{2054}, A, M) \\
S_2(P_1, P_2, \ldots, P_{2054}, A, M) &= D_2(P_1, P_2, \ldots, P_{2054}, A, M) \\
&\;\;\vdots \\
S_{2054}(P_1, P_2, \ldots, P_{2054}, A, M) &= D_{2054}(P_1, P_2, \ldots, P_{2054}, A, M).
\end{aligned}
$$ 


If we are given the values of A and M, we have as many unknown prices 
as equations and the system can therefore, presumably, be solved for the 
equilibrium values of the prices, $P_1, P_2, \ldots, P_{2054}$. Substitution of these 
values into the demand (or supply) expressions will then indicate the quantities 
of the various commodities which will be exchanged. This, in essence, 
is the general equilibrium system and the method by which it determines 
the prices and quantities sold of the various commodities. It has been, 
and can be, expanded and complicated in various ways, by including other 
variables explicitly, e.g., exogenous variables (variables whose values are 
determined by noneconomic phenomena) such as temperature, and endogenous 
variables such as advertising expenditure, both of which clearly 
affect demand. We can also go beyond the supply relationships to take 
explicit account of the behavior of firms and the availability of natural 
resources. But until some recent developments, which will be discussed in 
Chapter 23, the structure of general equilibrium analysis did not differ 
essentially from that just described. 
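
The procedure can be illustrated on a toy economy with two commodities in place of 2,054. The linear supply and demand functions below are hypothetical (A and M are treated as given and absorbed into the constants), and the small solver uses Cramer's rule, which is adequate only for this two-equation sketch.

```python
from fractions import Fraction


def demand(p1, p2):
    """Hypothetical demands for goods 1 and 2; each depends on both prices."""
    return (10 - 2 * p1 + p2, 8 + p1 - 3 * p2)


def supply(p1, p2):
    """Hypothetical supplies for goods 1 and 2."""
    return (3 + p1, 3 + p2)


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("equations are not independent; no unique solution")
    return (Fraction(b1 * a22 - a12 * b2, det), Fraction(a11 * b2 - a21 * b1, det))


# Setting supply equal to demand in each market and collecting terms gives
#   3*P1 -   P2 = 7
#  -  P1 + 4*P2 = 5
p1, p2 = solve_2x2(3, -1, 7, -1, 4, 5)

print("equilibrium prices:", p1, p2)
print("quantities exchanged:", demand(p1, p2))
```

Substituting the solved prices back into either the demand or the supply functions yields the same quantities, as the text describes.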


3. The Redundant Equation: Walras' Law 


There is one complication which arises even in this simple system. It 
requires discussion here because of the ideas to which it leads—and despite 
the fact that it turns out to be far less important than the earlier general 
equilibrium theorists believed. Of the 2,054 prices which have been included 
as variables, the last, the price of money, is a rather peculiar animal. By 
definition, the price of any item, say a hat, is the number of dollars it takes 
to purchase a unit of that good. But the unit of money is a dollar so that 
the number of dollars it takes to purchase a unit of money is exactly one 
(1). It is, therefore, inconceivable that the price of money should be anything 
but unity. In other words, $P_{2054}$, rather than being a variable, must 
be the number “1.” We have thereby lost one of our 2,054 variables though 
we are still apparently left with 2,054 equations. Now, as will be shown in 
Chapter 23, Section 1, having the same number of equations and unknowns 
does not guarantee that the system can be solved, nor is the absence of 
equality in the number of equations and unknowns necessarily fatal to the 
solvability of a simultaneous equation system. Nevertheless, earlier general 
equilibrium theorists set great store by this equality, so they considered it 
important to prove that one of the 2,054 equations is redundant—that in 
reality we have only 2,053 significant equations to match the 2,053 unknowns. 

For this purpose they discovered an important identity which has 
since come to be called Walras’ law. Any person who demands a commodity 
is, by definition, prepared to supply, in exchange, an amount of money (or 
other commodities) of equal value. Similarly, anyone who supplies some 
amount of goods on the market demands in exchange its value equivalent 
in money or other commodities. 

Every demand is thus matched by an equal supply (in dollar terms) 
of some other items and vice versa. It follows at once that the total money 
value of all items supplied must equal the total money value of all items 
demanded. In algebraic notation, 


$$\sum_{i=1}^{2054} P_i S_i \equiv \sum_{i=1}^{2054} P_i D_i, \tag{1}$$ 


where the three-pronged identity sign, ≡, means that (at least in an ordinary 
economy) this relationship must hold no matter what—whether there 
is equilibrium, or disequilibrium, whether prices are high or low. This 
identity, which is little more than an accounting relationship (it is difficult 
to imagine an economy in which it does not hold), is Walras’ law. 

To show how Walras’ law can be used to indicate that one of the general 
equilibrium equations is redundant, suppose that we find prices which 
satisfy all but one of the supply-demand equations, say every equation 
except the first. The sums of the money values of the supplies of all commodities 
(excluding the first) must equal the money values of their demands. 
Walras’ law then tells us, by subtraction, that the supply and demand 
of the first item must then also necessarily be equal.¹ That is, if $P_1S_1$ 
is unequal to $P_1D_1$ but the values of all other supplies, $P_iS_i$, equal the 
corresponding demand values, the sums of all of these supplies together 
cannot possibly add up to the values of the demands as Walras’ law requires. 
It follows that $P_1S_1$ cannot possibly be unequal to $P_1D_1$, i.e., if all 
other supplies and demands are equal, we must necessarily also have $S_1 = D_1$. 
Hence, if we find any prices which satisfy every supply-demand equation 
except the first, we need not bother testing them in the first equation, 
for we know, without trying them, that $S_1$ will equal $D_1$ at these prices. 
The first equation is then harmless and redundant—it adds no information 
which is not already given by the other equations, and causes no difficulties. 
We can drop the first equation and solve the others for the prices as if the 
omitted equation had never existed. 


1 Proof: Since $S_2 = D_2$, $S_3 = D_3$, ..., $S_{2054} = D_{2054}$, we have, multiplying by the 
corresponding prices, $P_2S_2 = P_2D_2$, $P_3S_3 = P_3D_3$, etc., and adding these equations 
together we obtain 

$$\sum_{i=2}^{2054} P_i S_i = \sum_{i=2}^{2054} P_i D_i.$$ 

Subtracting this equation from the Walras’ law identity we obtain our result, 

$$P_1 S_1 = P_1 D_1 \quad\text{or}\quad S_1 = D_1.$$ 

It is important to realize that the first supply-demand equation was 
picked for omission purely as a matter of expository convenience. Actually, 
Walras' law permits us to drop any single equation of our choice. Much 
confusion can be saved by realizing that no substantive issue is involved 
in the choice of the equation to be omitted, since whatever information it 
provides about any equilibrium prices and quantities will still be contained 
in the remaining equations. 
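
Walras' law and the redundancy it implies can be verified on a small example. The sketch assumes a pure exchange economy with two traders and three items, the third being money with a price of one; each trader spends fixed shares of the value of his endowment on each item. The traders, shares, and endowments are hypothetical, and "supply" here simply means the endowment brought to market.

```python
from fractions import Fraction as F

# Hypothetical traders: budget shares sum to one; endowments are (good 1, good 2, money).
AGENTS = [
    {"shares": (F(1, 2), F(1, 4), F(1, 4)), "endowment": (F(4), F(0), F(2))},
    {"shares": (F(1, 4), F(1, 2), F(1, 4)), "endowment": (F(0), F(4), F(2))},
]


def excess_demands(prices):
    """Demand minus supply for each item at the given prices."""
    n = len(prices)
    excess = [F(0)] * n
    for agent in AGENTS:
        wealth = sum(p * e for p, e in zip(prices, agent["endowment"]))
        for i in range(n):
            demand_i = agent["shares"][i] * wealth / prices[i]
            excess[i] += demand_i - agent["endowment"][i]
    return excess


def value_of_excess(prices):
    """Sum of P_i * (D_i - S_i); Walras' law says this is zero at any prices."""
    return sum(p * z for p, z in zip(prices, excess_demands(prices)))


arbitrary = (F(3), F(1), F(1))           # a disequilibrium price set
equilibrium = (F(3, 2), F(3, 2), F(1))   # clears the two goods markets

print("excess demands at arbitrary prices:", excess_demands(arbitrary))
print("their total value:", value_of_excess(arbitrary))
print("excess demands at equilibrium prices:", excess_demands(equilibrium))
```

At the arbitrary prices no market clears, yet the values of the excess demands still sum to zero. At the equilibrium prices the two goods markets clear, and the money market clears with them without having been imposed separately.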


4. Pitfalls in Determination of the Price Level² 


Since the general equilibrium equations which have just been described 
presumably determine all prices in the economy in money terms, they also 
determine the level of prices—whether prices in general are high or low— 
inflated or deflated. If the matter is left here, no trouble need arise. How- 
ever, for a long time the economic literature has contained fairly detailed 
and separate discussions of the theory of money and the price level. But 
what they wrote in these discussions often skated perilously close to con- 
tradictions with what had been said in the general equilibrium sections. 
It turns out that there are a number of well-concealed pitfalls in this area, 
in which some writers have, indeed, been caught. 

In order for there to be a determinate price level, there must be one 
price level which is consistent with equilibrium, and all other price levels 
should produce disequilibrium and therefore be untenable. This is, for 
example, what is postulated by the quantity theory of money which states, 
in effect, that the higher the price level, the more cash people will demand in 
order to be able to carry on their day-to-day business. Hence, given the 
supply of money, if the price level is very high, the demand for cash will 
exceed the supply. People will hold on to money rather than other assets 
(reduce their demands for goods) and the price level will be forced down 
toward its equilibrium value. Similarly, with a fixed money supply, if 
prices are below their equilibrium levels, the demand for money will be less 
than the supply; people will try to get rid of money by spending more, and 
prices will be forced to rise. This, then, is the classical mechanism of price- 
level determination. The central point merits repetition: In any theory of 
price-level determination there must be one price level which produces 
equilibrium, and the others must result in disequilibrium, for otherwise 
there will be nothing to select out the price level so that any monetary 
theory must be impossible. 


2 The next few sections are based on work of Lange and more particularly on that 
of Patinkin. See Oskar Lange, "Say's Law: A Restatement and Criticism," in Oskar 
Lange, Francis McIntyre, and Theodore O. Yntema (eds.), Studies in Mathematical 
Economics and Econometrics; In Memory of Henry Schultz, Chicago University Press, 
Chicago, 1942; and Don Patinkin, Money, Interest, and Prices, Harper & Row, Publishers, 
Inc., 2nd ed., New York, 1965. 
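
The classical mechanism can be sketched numerically. The proportional form of the demand for cash (a fraction k of the money value of transactions), the figures for the money supply and the volume of transactions, and the adjustment rule are all assumptions made for illustration; the text itself specifies only that cash demand rises with the price level.

```python
def money_demand(price_level, k, transactions):
    """Cash demanded, assumed proportional to the money value of transactions."""
    return k * price_level * transactions


def excess_money_demand(price_level, money_supply, k, transactions):
    """Positive when people want more cash than exists at this price level."""
    return money_demand(price_level, k, transactions) - money_supply


def equilibrium_price_level(money_supply, k, transactions):
    """The single price level at which cash demand equals the money supply."""
    return money_supply / (k * transactions)


def adjust(price_level, money_supply, k, transactions, steps=60):
    """Move the price level against excess money demand, halving the gap each step."""
    for _ in range(steps):
        gap = excess_money_demand(price_level, money_supply, k, transactions)
        price_level -= 0.5 * gap / (k * transactions)
    return price_level


# Hypothetical figures.
M, K, T = 1000.0, 0.25, 2000.0
P_STAR = equilibrium_price_level(M, K, T)

print("equilibrium price level:", P_STAR)
print("excess demand for cash at P = 3:", excess_money_demand(3.0, M, K, T))
print("excess demand for cash at P = 1:", excess_money_demand(1.0, M, K, T))
print("price level after adjustment from P = 3:", adjust(3.0, M, K, T))
```

A price level above equilibrium produces an excess demand for cash and is pushed down; one below equilibrium produces an excess supply and is pushed up. Only one price level leaves the money market at rest.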

But here is where the conflict between price-level analysis and the rest 
of general equilibrium theory can arise. It is the essence of general equilib- 
rium theory that what happens in one sector affects what goes on elsewhere. 
In particular, as we shall soon see again, Walras’ law provides a strong 
link between the monetary sector and the rest of the economy. Assumptions 
about the structure of the nonmonetary aspects (the so-called real sector) 
of the economy can therefore affect the mechanism which determines the 
price level. But a number of more or less plausible-sounding assumptions 
have been made about the real sector which, without its being realized by 
those who made them, served effectively to destroy any mechanism for the 
determination of the price level, for these assumptions make it impossible 
for any change in the price level to produce equilibrium or disequilibrium. 
That is, under these assumptions, if there is equilibrium with price level 
A, there will also be equilibrium, other things being equal, with any other 
price level, B, whereas if one price level produces disequilibrium, any other 
price level will produce disequilibrium! There is, thus, no such thing as a 
unique equilibrium level of prices. Any attempt to graft a price-level theory 
onto this kind of system must produce a contradiction since such a theory, 
as we have seen, must state that some price level produces equilibrium, and 
the others disequilibrium. 

The assumptions about the real sector of the economy which have such 
unexpected and distressing effects on the monetary analysis are the pitfalls 
of the general equilibrium analysis which were mentioned at the beginning 
of this section. Let us examine these assumptions one by one and see how 
they lead to trouble. 


1. The homogeneity postulate. It has sometimes been argued that the 
demands for and supplies of commodities are affected only by relative 
prices, and not by their magnitudes in money terms. It makes no differ- 
ence to hat and shoe purchases, in this view, whether hats are $1 and shoes 
$3, or hats $5 and shoes $15. In both cases the relative price—the hat-shoe 
price ratio—is three to one. The argument is that if, suddenly, the government 
were to double the face value of all coins and pieces of paper money— 
if all dollar bills were to have the legend “two dollars” stamped over their 
faces—nothing in the economy need be affected except its accounting 
records. Umbrellas would nominally cost twice as much, but the buyers’ 
dollar incomes would be twice as high. Thus, the argument runs, if we 
increase all money prices but raise them strictly in proportion, no com- 
modity’s supply or demand will be affected. 


This assumption has been called the homogeneity postulate, the ter- 
minology being drawn from the mathematical expression for a relationship 
in which a proportionate change in all of the variables (the prices) has no 
effect on the dependent variables (quantities supplied and demanded). 
Demand and supply relationships in which a proportionate change in 
prices leaves things unchanged are said to be homogeneous of degree zero in 
prices alone. 

This homogeneity assumption is inconsistent with the determination of 
any price level. It will be recalled that, by Walras’ law, if there is supply 
and demand equilibrium in every market except one, then the remaining 
market must also be in equilibrium. Hence, in particular, if demand equals 
supply in every market in the real sector of the economy (demand equals 
supply for every commodity except money), then the supply of and demand 
for money must also necessarily be equal. In this case we must have an 
equilibrium price level. Suppose, now, that the price level changes, with 
all prices varying in exactly the same proportion. If the supply and demand 
relationships are homogeneous, then none of them will be affected by this 
change in price level—the demand for neckties will remain equal to the 
supply of neckties, the demand for pianos will remain equal to their supply, 
etc. This change in price level, with no change in relative prices, cannot 
disturb the equilibrium in any market of the real sector, so that, by Walras' 
law, the supply of and demand for money must also remain equal. Thus, it 
is impossible for any change in price level alone (leaving relative prices 
unaffected) to produce disequilibrium in the money market. In sum, we 
see that the homogeneity postulate, because it precludes the price level 
from affecting the real sector, also prevents it from affecting the money 
market. If we accept that assumption, it is impossible to have any monetary 
theory or any determinate price level—any price level will do as well as 
any other because they will all be equally consistent with equilibrium. 


2. Dichotomy of pricing in the real and monetary sectors. A second and 
closely related assumption, which leads to similar difficulties, is the premise
that it is possible to divide price determination into two completely inde- 
pendent parts—the determination of the absolute price level occurring 
entirely in the monetary sector of the economy and the determination of 
relative prices only in the real sectors of the economy. That is, relative 
prices are determined by the supply-demand equations for commodities, 
and then the price level is determined separately by the money supply- 
demand equation. 

If this two-part, or dichotomized, price determination premise implies
that the price level has absolutely no effect on the real sector of the econ- 
omy, it runs into the same trouble as the homogeneity postulate. If no 
change in price level can produce disequilibrium in any commodity market, 
by Walras’ law, it cannot produce money-market disequilibrium either. 
Hence, this extreme form of the dichotomous price determination assump- 
tion also produces a contradiction with any monetary theory. If the price 
level does not affect the commodity markets, it cannot be determined in any 
market. However, we shall see later that there is another closely related, 
but entirely legitimate form of this dichotomy assumption. 


3. Say’s identity. A third and final assumption which can preclude the 
determination of a price level is one of the several (no longer fashionable) 
propositions which have at one time or another been labeled Say’s law. 
This version of the proposition attributed to Say is the one which is found 
in most modern references. It asserts that people offer things for sale only 
because they want other goods and services in exchange. If they accept 
money for the goods they sell, they do not do so because they want the 
money for its own sake but because they desire to take the money at once 
and buy other goods with it. In this way, every supply of a good brings 
with it a demand for an equivalent amount of goods. Whether prices are 
high or low, rising or falling, the supply of all goods taken together must 
equal the demand for all goods taken together. This version of Say’s law 
is compatible with an overproduction of hula hoops or some other particular 
commodities, if, for example, people’s tastes have unexpectedly swung away 
from hula hoops. But the overproduction of these items must be matched 
by an undersupply of some other goods on which suppliers do want to spend 
their money. Thus, it is possible for producers to turn out the wrong goods 
but they can never turn out too many goods for the buying public. General 
overproduction of commodities is impossible. 


This version of Say’s law is a first cousin of Walras’ law which states 
that supplies of goods and money together must equal demands for goods 
plus money. Say’s law is the stronger assertion that people don’t want 
money except to buy goods at once, so that the total supply of commodities 
alone (excluding money) is necessarily identical with the total demand for 
commodities alone. For this reason it has been proposed that this version 
of Say’s proposition be called Say's identity. 

Since Say's identity requires that the goods markets, taken as a whole, 
must always be in equilibrium (total supply for all goods equals total de- 
mand), it follows by Walras' law that the remaining market, the money 
market, must also always be in equilibrium. It is impossible for any change 
in the price level (or any change in anything else, for that matter) ever to 
produce disequilibrium in the money market. As a result, Say’s identity
also precludes the determination of any price level by any relationship of
monetary theory. 
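
The step from Say's identity to the money market can be spelled out in symbols. The sketch below assumes the chapter's illustrative count of 2,054 goods with money as the 2,054th, and writes $P_i$, $S_i$, and $D_i$ for the price, supply, and demand of good $i$; this notation is an assumption intended to match the Walras' law identity of Section 3.

```latex
\begin{align*}
\text{Walras' law:}\quad
  & \sum_{i=1}^{2054} P_i S_i \equiv \sum_{i=1}^{2054} P_i D_i, \\
\text{Say's identity:}\quad
  & \sum_{i=1}^{2053} P_i S_i \equiv \sum_{i=1}^{2053} P_i D_i, \\
\text{subtracting:}\quad
  & P_{2054} S_{2054} \equiv P_{2054} D_{2054}
  \;\Longrightarrow\; S_{2054} \equiv D_{2054}.
\end{align*}
```

Since the last line holds identically, at every set of prices, no price level is singled out by the money market.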

To summarize, we have now examined three assumptions which have 
at some time or another appeared in the economic literature—the homo- 
geneity postulate, the dichotomization assumption, and the Say’s identity 
assumption. We have seen that any one of these causes serious trouble for 
general equilibrium theory by making a monetary theory impossible. 
Only in a barter economy where money plays no role and there is no absolute 
price level can any of these assumptions be made without causing such 
difficulties. 


5. The Real Balance Effect 


In practice, a change in the price level can have very profound effects 
on demands for and supplies of commodities. Perhaps the most powerful 
influence of a price change is that which operates through the public’s 
expectations. A rise in price level has, for example, been known to stampede 
buyers into purchasing goods in the fear that their prices will go up even 
further. Thus, a price change, by leading buyers or sellers to expect further 
price changes in the future, can induce them to speed up or to postpone 
purchases and sales of commodities. 

Another influence of a change in price level is its effect on the purchasing 
power of a stock of cash. If a man has $1,000 in cash, and prices fall by half, 
the value (purchasing power) of his stock of cash will have doubled. More 
generally, it can be seen that a rise in prices will lower the real value of 
cash holdings (it will reduce the real wealth of the owners of the money) 
whereas a fall in prices will raise the real value (purchasing power) of cash 
holdings. A proportionate fall in all prices may leave real incomes unaffected 
(if wages and commodity prices both fall 50 per cent, the purchasing power 
of workers’ incomes is unaffected). But we see that the fall in prices must
increase the real wealth of people who hold cash. 


3 In algebraic terms Say’s identity may be written, using the notation of Walras’
law identity (1) in Section 3, as

$$\sum_{i=1}^{2053} P_i S_i = \sum_{i=1}^{2053} P_i D_i.$$

This differs from the Walras’ law identity only in one respect. The numbers above the
$\Sigma$’s are 2,054 in the Walras’ law case but they are 2,053 here. That is so because the
supply of and demand for the 2,054th commodity, money, do not enter into Say’s
identity.

Let us now make the reasonable assumption that an increase in a
person’s wealth will lead him to increase his expenditures either on con- 
sumption or on investment goods, even if only by a small amount. Then 
the fall in price level must increase the public’s demand for goods and 
services because, as we have just seen, it increases the real wealth of cash- 
holders. Thus, changes in the price level must also affect demands for and 
supplies of goods. 

The effect of price-level changes on demands and supplies which operates 
through the resulting change in the purchasing power of cash has been 
discussed in the literature under a variety of names. In different places 
it has been called the Pigou effect,⁴ the real balance effect (the effect
which operates via the purchasing power of cash balances), and the wealth- 
saving relationship (the effect of the change in real wealth on saving and 
expenditure). 

The real balance effect amounts to a direct denial of the homogeneity 
postulate and the dichotomization assumption which were described in the 
previous section, for through this effect, a change in price level (a propor- 
tionate change in all prices) does cause variation in demands for and sup- 
plies of goods (unless, of course, cash stocks were also to change in the 
same proportion). In this way, absolute prices play a role in the real sector 
of the economy and not just in the money market. The real balance effect 
is also incompatible with Say’s identity for it implies, e.g., that a sufficiently 
large rise in the price level can lead to such a reduction in the purchasing 
power of cash-holders that the demand for goods will, taken as a whole,
fall below the supply—there will be general overproduction—a phenomenon 
which, Say’s identity asserts, is impossible. 

The real balance effect is an essential piece of the machinery which 
works to produce equilibrium in the money market. Suppose, for example, 
that for some reason prices fall below their equilibrium level. This will 
increase the real wealth of cash holders, lead them to spend more money, 
and that in turn will drive prices back up toward equilibrium. Thus, the 
real balance effect is a part of the equilibrating mechanism of the money 
market. It is a force behind the working of the quantity theory or whatever 
analysis we wish to use to explain the determination of the equilibrium price 
level. 
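
The equilibrating role of the real balance effect can be sketched numerically. The model below is a deliberately simple assumption, not the author's: excess demand for goods is taken to be proportional to the gap between actual real balances (money stock divided by the price level) and a desired level, and the price level is nudged upward whenever excess demand is positive. All parameter values are hypothetical.

```python
def excess_goods_demand(price_level, money_stock=1000.0,
                        desired_real_balances=500.0, spend_rate=0.1):
    """Excess demand for goods generated by the real balance effect.

    When prices are low, real balances exceed the desired level and
    cash-holders spend the surplus; when prices are high, they cut back.
    """
    real_balances = money_stock / price_level
    return spend_rate * (real_balances - desired_real_balances)


def adjust_prices(start_price, steps=100, speed=0.01):
    """Raise the price level under excess demand, lower it under excess supply."""
    path = [start_price]
    for _ in range(steps):
        current = path[-1]
        path.append(current + speed * excess_goods_demand(current))
    return path


# Start with prices at half their equilibrium level (equilibrium is 1000 / 500 = 2).
path = adjust_prices(start_price=1.0)
print(path[:4])
print(path[-1])  # settles close to 2.0
```

With these numbers the price level climbs back toward the value at which real balances equal their desired level, and the excess demand vanishes there.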


4 After A. C. Pigou, to whom the idea was attributed.


6. Comparative Statics: General Equilibrium Analysis


Suppose, however, we were to decide to ignore problems of disequilib-
rium and confine our attention only to general equilibrium problems. For
example, if there is an influx of money into the economy, we may ask not
about its impact effects, but rather, what it will have done to the economic
situation after the economy has had a chance to adjust to the new money
supply—after it has reached its new equilibrium. This approach is called 
the method of comparative statics. If an exogenous change occurs, we do 
not, as in dynamics, trace out the course of the resulting developments as 
time passes. Rather, we compare the initial equilibrium position only with
the new equilibrium which might eventually result from this change. We 
compare only two static equilibrium situations. 

In such a comparative-statics analysis, it turns out that the real balance 
effect may lose much of its importance so that in such cases a modified sort 
of homogeneity postulate and dichotomization assumption may become 
legitimate. 

Suppose there is a doubling of the supply of money. Suppose, more- 
over, that after the smoke has had a chance to clear away, all prices will 
have doubled, and the demand for money will also have doubled. In this 
situation there is no change in relative prices, and the supply and demand 
for money are once again equal (at twice their original level). The pur- 
chasing power of stocks of money will have been reduced back to its original 
level (there are twice as many dollars but each dollar is worth only half 
as much as it was). 
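
A minimal sketch of this comparison of two equilibria follows. It assumes, hypothetically, that money-market equilibrium requires real cash balances to equal a fixed desired level, and that relative prices are settled independently in the real sector; the goods and all numbers are illustrative.

```python
DESIRED_REAL_BALANCES = 500.0  # hypothetical equilibrium level of real cash holdings


def equilibrium(money_stock, relative_prices):
    """Return (price level, money prices, real balances) in full equilibrium."""
    price_level = money_stock / DESIRED_REAL_BALANCES
    money_prices = {good: price_level * r for good, r in relative_prices.items()}
    real_balances = money_stock / price_level
    return price_level, money_prices, real_balances


relative_prices = {"neckties": 1.0, "pianos": 300.0}

old = equilibrium(1000.0, relative_prices)
new = equilibrium(2000.0, relative_prices)  # the money supply doubles

print(old)
print(new)
```

Only the two end positions are compared; nothing is said about the path between them. Every money price doubles, while relative prices and real balances are the same in both equilibria.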

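With all relative prices unchanged, and with the purchasing power of
cash balances back at its original level, there is no reason for the
demand for or supply of any commodity to differ from its initial level.
The change in money supply will thus have affected the absolute price
level, but it will not have had any effect on the real sector of the economy.
Equilibrium relative prices and commodity supplies and demands will all
be unchanged.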

We see that these assertions come very close to the homogeneity postu- 
late (price-level changes do not affect demands for and supplies of goods) 
and the dichotomization assumption (events in the money market deter- 
mine only the price level and have no effect on the real sector of the econ- 
omy, which can be analyzed separately). However, these two assumptions 
are legitimate only in their modified form in which they refer just to equi- 
librium prices and equilibrium supply and demand levels. Only in such a 
comparative-statics context are these premises acceptable. It is important 
to recognize that the classical and neoclassical economists often used equi- 
librium concepts and comparative-statics techniques without specifying 
explicitly that they were doing so. When homogeneity and dichotomization
assumptions are encountered in their writings, it is necessary to
recognize, therefore, that they may well have been meant in an innocuous 
comparative-statics sense so that no error need have been committed at 
that point in their discussion. 

This section has argued that only where questions of dynamics and 
disequilibrium are considered do the homogeneity and dichotomization 
assumptions necessarily run into trouble. However, the reader should realize
that even equilibrium supplies and demands may be affected by price-level
changes. If in the course of a price rise there is, for example, a redistribu- 
tion of real wealth, as will often be the case, demands will shift in accord 
with the tastes of those whose purchasing power has been increased. 
All that the preceding argument has shown is that it is possible for equilib- 
rium demands to be unaffected by a price rise so that there is no logical 
impossibility in the homogeneity postulate when applied to equilibrium 
supplies and demands. It has not been maintained, even in a comparative- 
statics analysis, that this postulate must always, of necessity, be valid. 


7. Optimal Cash Balances 


Before leaving our discussion of the theory of money, let us inquire a 
little more closely into the structure of the demand for money. 

Keynes, in his General Theory, divides the demand for liquid funds into 
three categories: that which is desired for transactions purposes (the cash 
needed to meet foreseen payments like a firm's payments which are re- 
quired by contract); that for precautionary purposes (cash needed to meet 
payments whose magnitude is not known in advance); and that to be used
for purposes of speculation. Keynes implies that the demand for cash for 
transactions and precautionary purposes will be rather strongly responsive 
to changes in expenditure levels, but that these demands will be relatively 
interest inelastic.⁵

It is, of course, possible to accept this as an assumption or as an impres- 
sion garnered from observation, but usually, on such a question, the 
theorist prefers to probe somewhat more deeply. Why should people and 
firms keep more cash when their expenditures rise? And if there is a reason 
for their balances to be increased in these circumstances, what determines 
the amount by which their money holdings should rise? Finally, we may 
well ask whether interest rates should not also influence significantly the 
magnitudes of these cash holdings. These are all questions of good
management. Cash is kept not for its own sake but because it helps the consumer
and the businessman to carry on his activities. The questions to be asked,
then, are whether there is some way in which an optimal cash balance
can be computed and, if so, how this optimal cash balance figure will be
affected by changes in incomes and interest rates. These are obviously
important questions for business management, as well as for economic theory.⁶

5 See J. M. Keynes, The General Theory of Employment, Interest and Money, Harcourt,
Brace & World, Inc., New York, 1936, pp. 196-97.

A firm’s cash balance can usually be interpreted as an inventory—an
inventory of money which its holder stands ready to exchange against
purchases of labor, raw materials, etc. It is really no different in principle
from a shoe manufacturer’s inventory of footwear which he stands ready
to trade for the distributor’s cash. The reason for comparing cash on hand
with a commodity inventory is that we already possess a body of techniques
for determining optimal inventory levels.⁷ These techniques can be used
to balance off the advantages of a sizable cash balance against its costs. 

It is, of course, convenient to keep a sizable cash balance on hand be- 
cause that can make it so much easier to meet required disbursements, 
particularly because it is not always possible to foresee in advance the 
precise magnitudes of required expenditures. But it is expensive to tie up 
large amounts of capital in the form of cash balances. For that money 
could otherwise be used profitably elsewhere in the firm, or it could be 
used to pay off debt and reduce the firm’s interest burden, or the money 
could be invested profitably in securities. When tight money limits the 
funds which are in practice available to the businessman, he must recognize 
that every dollar he keeps in the form of cash on hand means one dollar 
less available for the purchase of labor, raw materials, etc. 

To see precisely how the optimum cash inventory computation is 
handled, let us go directly to the calculation of the optimal level of that 
portion of a company’s cash inventory which is used to meet payments 
whose magnitude is known in advance. Suppose the company receives 
$80,000 in cash on the first day of each month which it will pay out in 
regular daily installments over the next month. Rather than keep all 
of this cash idle, some of it can be invested in securities, say at a return of 
5 per cent. But each time some cash is invested or withdrawn there is a 
fixed brokerage charge, say $25. The company may then consider the three 
alternatives shown in Table 1 for a four-week month. 

TABLE 1

                                          Week                       Average
                                 1         2         3         4  Investment
                                                                     Holding

Possibility A: No investment (zero broker transactions per month)
  Investment purchases           0         0         0         0
  Investment holdings            0         0         0         0           0
  Withdrawals                    0         0         0         0
  Payments                 $20,000   $20,000   $20,000   $20,000

Possibility B: Two broker transactions per month
  Investment purchases     $40,000         0         0         0
  Investment holdings      $40,000   $40,000         0         0     $20,000
  Withdrawals             $40,000*         0   $40,000         0
  Payments                 $20,000   $20,000   $20,000   $20,000

Possibility C: Four broker transactions per month
  Investment purchases     $60,000         0         0         0
  Investment holdings      $60,000   $40,000   $20,000         0     $30,000
  Withdrawals             $20,000*   $20,000   $20,000   $20,000
  Payments                 $20,000   $20,000   $20,000   $20,000

* This amount is in fact never invested or withdrawn—it represents the amount
withheld from the initial investment.

Notice that as the frequency of withdrawals increases, the average
investment goes up from 0 to $20,000 to $30,000; thus, the annual interest
earnings at 5 per cent rise from 0 to $1,000 to $1,500. But in method A
there are no brokerage charges. In method B there is one investment and
one withdrawal per month, or 24 broker transactions per year, which result
(at $25 per transaction) in a total brokerage fee of $600. Method C requires
four investment and withdrawal transactions per month, or forty-eight
per year, which will cost just $1,200. Thus we have the results shown in
Table 2.

6 Much of the analysis which follows is based on my article, “The Transactions
Demand for Cash: An Inventory Theoretic Approach,” Quarterly Journal of Economics,
Vol. LXVI, November 1952. See also James Tobin, “The Interest Elasticity of
Transactions Demand for Cash,” Review of Economics and Statistics, Vol. XXXVIII, August
1956.

7 See the illustrative inventory analysis of Chapter 1 and the list of references at the
end of that chapter.


Clearly, method B is the most profitable of the three ways for the firm to
manage its cash.
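
The arithmetic behind the comparison of the three methods can be reproduced in a few lines. The figures (5 per cent interest, a $25 brokerage charge, weekly holdings, and transactions per month) are those of the example in the text; the variable and function names are merely illustrative.

```python
INTEREST_RATE = 0.05   # annual return on securities
BROKER_FEE = 25.0      # fixed charge per investment or withdrawal
MONTHS_PER_YEAR = 12

# Weekly investment holdings over a four-week month, and broker
# transactions per month, for each method in the example.
methods = {
    "A": {"weekly_holdings": [0, 0, 0, 0], "transactions_per_month": 0},
    "B": {"weekly_holdings": [40_000, 40_000, 0, 0], "transactions_per_month": 2},
    "C": {"weekly_holdings": [60_000, 40_000, 20_000, 0], "transactions_per_month": 4},
}


def net_gain(weekly_holdings, transactions_per_month):
    """Annual interest earned on the average investment minus annual brokerage cost."""
    average_investment = sum(weekly_holdings) / len(weekly_holdings)
    interest = INTEREST_RATE * average_investment
    broker_cost = BROKER_FEE * transactions_per_month * MONTHS_PER_YEAR
    return interest - broker_cost


results = {name: net_gain(**spec) for name, spec in methods.items()}
best = max(results, key=results.get)

print(results)
print(best)
```

Method C earns more interest than method B, but the extra brokerage charges more than absorb the difference.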

More generally, it is possible to show how the optimum balance (inven- 
tory) of cash not held in short-term investments will increase when the 
volume of transactions or the brokerage fee increases, and decrease when 
the interest rate increases. The inventory analysis of Chapter 1 indicates 
that these will not be proportionate variations. For example, the optimal 
cash balance will increase only as the square root of the volume of trans- 
actions—i.e., there will be economies of large scale in the firm’s optimal 
cash balance. 
TABLE 2

                                Annual        Broker        Annual         Net Gain
                 Average      Interest  Transactions        Broker  (Interest Minus
              Investment       Earning      Per Year          Cost   Broker’s Cost)
Method A               0             0             0             0                0
Method B         $20,000        $1,000            24          $600             $400
Method C         $30,000        $1,500            48        $1,200             $300



The reasons for this result can be suggested without the aid of mathematics.
Most important, it must be noted that a given volume of payments
can be met with different cash withdrawal levels. We observed that the 
$80,000 could be paid by keeping the entire $80,000 on hand, or by investing 
it and withdrawing $40,000 twice a month, etc. In other words, even when 
the firm’s total payments are fixed, the average cash balance used to 
meet these payments can be varied. We can see why this amount will 
vary directly with the value of the brokerage fee and inversely with the 
interest rate. Clearly, if the brokerage fee goes up, it will pay to cut down 
the number of withdrawals, i.e., the optimal cash balance will rise. Simi- 
larly, if the level of the interest rate goes up, it will pay to make withdrawals 
as small and as late as possible, i.e., the optimal balance of idle, noninterest- 
earning cash will fall. In sum, if firms are efficient profit maximizers, Keynes 
was probably wrong in playing down the influence of the interest rate on 
the transactions demand for cash. 


In addition, we now have a firmer foundation for his view that the 
demand for cash should increase with the volume of transactions. But 
why should it not increase proportionately (as Keynes suggests)? That
is—why should the most economical cash holding increase relatively less 
than the volume of expenditures which this cash is used to finance? The 
answer is to be found in the nature of the cost of investment transactions. 
The minimum broker's fee is what makes it unprofitable to take cash out 
of investments in frequent small driblets, although doing so will keep cash 
invested until the last possible moment. But the larger the amounts in- 
volved, the smaller, relatively speaking, will be the brokerage costs. On a 
$1,000 bond purchase, minimum brokerage fees can be costly. On a 
million-dollar transaction they are negligible. Hence, the larger the total 
amounts involved, the less significant will be the brokerage costs, and the 
more frequent will be optimal withdrawals. For this reason optimal with- 
drawals and cash balances will rise when the volume of transactions per 
firm increases, but will rise less than in proportion with the volume of
transactions payments. 
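
The trade-off just described can be sketched under a simplified cost model. The assumptions here are ours, not the table's exact accounting: a brokerage fee is charged on each withdrawal only, and interest is forgone on an average idle balance equal to half the withdrawal. Under those assumptions the cost-minimizing withdrawal is the square-root expression cited in the chapter's footnote on the optimal inventory. The annual volume of $960,000 is twelve times the $80,000 of the example; everything else is illustrative.

```python
import math


def annual_cost(withdrawal, volume, broker_fee, interest_rate):
    """Brokerage cost of volume / withdrawal transactions per year, plus
    interest forgone on an average idle cash balance of withdrawal / 2."""
    return broker_fee * volume / withdrawal + interest_rate * withdrawal / 2.0


def optimal_withdrawal(volume, broker_fee, interest_rate):
    """Square-root rule: the withdrawal size that minimizes annual_cost."""
    return math.sqrt(2.0 * broker_fee * volume / interest_rate)


volume, fee, rate = 960_000.0, 25.0, 0.05

best = optimal_withdrawal(volume, fee, rate)

# Crude check: search a grid of withdrawal sizes in $1,000 steps.
grid = range(1_000, 100_001, 1_000)
grid_best = min(grid, key=lambda c: annual_cost(c, volume, fee, rate))

# Quadrupling the volume of transactions only doubles the optimal withdrawal.
quadrupled = optimal_withdrawal(4 * volume, fee, rate)

print(round(best, 2), grid_best)
print(round(quadrupled / best, 4))
```

The grid search lands next to the closed-form value, and the last line shows the economies of scale: cash holdings rise with the square root of the volume of payments.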

For expository simplicity, this discussion has assumed that any reduc- 
tion in cash holdings is used to purchase short-term securities. In practice, 
tight money means that frequently funds can more profitably be invested 
inside the firm. This does not alter the nature of the analysis in any funda- 
mental way. The same methods can be employed to take this fact into 
account in determining the way in which the firm can use cash most effec- 
tively and most economically. 

It is also interesting to note, without any attempt at explanation, that 
on not entirely implausible assumptions, somewhat similar results can be 
derived for the precautionary demand for cash, though the method of 
analysis is considerably different from that which has just been described. 

Some final remarks are appropriate to tie in the discussion of this sec- 
tion with the rest of the chapter. We have seen throughout the general 
equilibrium discussion how large a role was played by the demand for cash 
balances. Now, with the aid of inventory analysis we have been able to 
make some deductions about the nature of that demand. 

In particular, we can now say something about the relationship between 
changes in the price level and the demand for cash balances. Suppose 
prices rise by 37 per cent; will people end up demanding 37 per cent more 
money as is so often assumed in the general equilibrium discussions? Our 
inventory model tells us that (if the pattern of the cash-holders’ purchases
does not change)⁸ they will—that optimal cash balances will increase
in precisely the same proportion as the price level. If price level
goes up by this percentage, the money value of the buyer’s transactions 
will clearly also rise by 37 per cent, and this might lead us to suspect that 
the demand for cash will rise by only a smaller proportion. But a uniform 
rise in prices means that brokerage fees will also rise by 37 per cent so that 
larger cash balances will become desirable in order to avoid investments 
and withdrawals and the brokerage costs which they incur. The two effects 
together—that of the increased money value of transactions and that of 
the increased brokerage fee—can easily be shown to lead to a rise in the 
optimal demand for cash in precise proportion with a change in the price 
level.⁹
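
The two effects can be separated numerically with the square-root inventory rule cited in the chapter's footnote. The base figures below are illustrative assumptions; only the 37 per cent price rise comes from the text.

```python
import math


def optimal_cash_withdrawal(volume, broker_fee, interest_rate):
    """Square-root rule D = sqrt(2 * a * Q / k) for the optimal withdrawal."""
    return math.sqrt(2.0 * broker_fee * volume / interest_rate)


W = 1.37  # prices rise by 37 per cent
volume, fee, rate = 960_000.0, 25.0, 0.05  # illustrative base values

before = optimal_cash_withdrawal(volume, fee, rate)

# Only the money value of transactions rises with the price level.
transactions_only = optimal_cash_withdrawal(W * volume, fee, rate)

# Brokerage fees rise in the same proportion as well.
transactions_and_fee = optimal_cash_withdrawal(W * volume, W * fee, rate)

print(round(transactions_only / before, 4))     # less than proportionate
print(round(transactions_and_fee / before, 4))  # proportionate to the price level
```

Taken alone, the larger money value of transactions raises desired cash by the square root of 1.37; once the brokerage fee rises in step, the increase is the full 37 per cent.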


8 This implies either that full equilibrium has been achieved or that the real balance 
effect can, for present purposes, be ignored. 

9 To prove this the reader need merely glance at the final expression for the optimal
inventory in Section 6 of Chapter 1. This result states that

$$D = \sqrt{\frac{2a^*Q^*}{k^*}}.$$

In terms of the current discussion we can interpret $D$ (or, rather, $D/2$) as the optimal
average cash inventory, $Q^*$ as the volume of transactions (payments to be met), $k^*$ as
the interest cost of carrying cash (the interest rate), and $a^*$ as the brokerage fee (reorder
cost). Then, if brokerage fee and the volume of transactions are each increased by a
factor of $W$ (each rises to $W$ times its former level), the expression under the square
root sign then rises by a factor of $W^2$ so that the optimal cash balance also goes up to exactly
$\sqrt{W^2} = W$ times its initial level.


General Equilibrium 
and Welfare Economics 


21


Welfare economics is the branch of economic theory which has 
investigated the nature of the policy recommendations that the economist 
is entitled to make. Its literature has mostly discussed two types of subject: 
(1) the fundamental but quasi-philosophical problems involved in distin- 
guishing a “legitimate” from an “illegitimate” recommendation and (2) 
the construction of a theoretical framework which can be applied to some 
actual policy problems. We shall be concerned primarily with the latter, 
leaving the first, more methodological problem until the end of the chapter. 


1. Resource Allocation and General Equilibrium


Welfare economics has concerned itself mostly with policy issues which 
arise out of the allocation of resources—with the distribution of inputs 
among the various commodities and the distribution of commodities among 
the various consumers. 

This is a general equilibrium problem because if resources are moved 
into one industry, they must presumably be taken out of another, and the 
interrelationships of the two industries constitute the heart of the matter. 
The problem of determining the optimal outputs of the various com- 
modities produced in the economy arises only because the quantities of all 
resources are limited. In such circumstances, it is no answer to say that 
more of any commodity is a good thing. If we produce more guns, there
will be less farm labor available to produce butter. It may be highly
undesirable to increase the output of product a because the required
concomitant decrease in product b is (on some criterion) more valuable. The
optimal allocation of resources between the two items is a matter of the 
relative urgency of the demands for them and their relative costs of produc- 
tion. No product's optimal output level can therefore be determined in 
isolation but only in a comparison with other commodities with which it 
competes for society’s limited resources. This is the basis for the conclusion 
that, at least in principle, resource allocation is necessarily a matter for 
general equilibrium analysis. 


2. The Maximands: Consumers’ and Producers’ Surplus 


Before getting to the specific criteria of optimality that have been 
derived by welfare economists, we must first ask what is meant by op- 
timality of resource allocation—what it is that society can be taken to be 
maximizing. Here we are not concerned with the matter as a philosophical 
issue—that aspect of the subject will be touched on later in the chapter. 
Rather, the issue is operational. As elsewhere in microeconomics, optimiza- 
tion is taken to denote maximization of the value of some objective 
function. In the conventional theory of the consumer that maximand is the 
utility function. In the conventional theory of the firm it is the profit 
function. What can we use for the purpose in dealing with the welfare of 
the entire community? 

Whatever their shortcomings, two approaches have so far proved most 
fruitful. One involves maximization of consumers’ and producers’ surpluses,
and the other is the approach which can be described as Pareto optimality. 
These will be discussed in turn in this and the following section. 

The notion of consumers’ surplus always seems on a superficial view to 
involve a bit of flimflam. The idea is that every consumer gets out of each 
transaction something more than he pays for the item he purchases. This 
must be so because no one forces him to make a purchase.¹ If he were to
end up with no net benefit, he would not bother to make that purchase.
This is the secret of the mutual gains from trade which even Marx was at
pains to emphasize.² In a voluntary trade both parties must end up with a
net gain even though their total holdings after the trade must be exactly
as much as they held between them before. The explanation of this bit of
magic, of course, is that each has obtained from the other something he
valued more in exchange for something he valued less—I traded some milk
which I do not like for some cheese which I find delicious. Since we have
the opposite tastes, we both obtained a surplus from the transaction.

1 One may be tempted to argue that one is forced to buy food no matter what its
cost. But that is equivalent to saying that since it preserves the lives of the members of
our family it is priceless and certainly worth more to us than the price we pay for it no
matter how exorbitant we may consider that price.

2 See Capital, Volume I, Charles H. Kerr and Co., Chicago 1906, p. 175.

Thus, the notion that each consumer obtains a surplus from each 
voluntary transaction involves no moral judgment about the desirability 
or fairness of the prices that are charged or the generosity (or rapacity) 
of the seller. It merely is an observation about the nature of voluntary 
participation in economic transactions. 

Since in the last analysis all economic activity involves giving up 
something (money or physical inputs) in exchange for something else, it 
seems an appropriate objective to maximize the sum of the net gains, i.e., 
the surpluses obtained from consumption and production. The problem is, 
how does one measure these net gains? Consumers’ surplus is the econo- 
mist’s answer to that question, one which often permits him to make actual 
numerical estimates in practice. 

Jules Dupuit, who is credited with the invention of the concept,® 
proposed that one measure the net benefit in monetary units and that one 
use as the measure of consumers’ surplus the area between the demand 
curve and the horizontal line indicating the price paid for the commodity. 
The reason is simply this: since consumer equilibrium requires the 
consumer to equate the price of a commodity with its marginal utility 
(measured in money),⁴ the demand curve for commodity i becomes a curve of 
marginal utility of that good. In terms of Figure 1a, since quantity $x_a$ 
will be bought at price $p_a$, the marginal utility of X at quantity $x_a$ 
must be $x_aA = p_a$. The total utility of that quantity of X must therefore 
be the area under the curve between the origin and $x_a$, i.e., it must equal 
$0x_aAD$, for it must be the sum of the marginal utility rectangles such as 
those marked B, C, and D, representing, respectively, the marginal utility 
of the first, second, and third units of X. The amount the consumer pays if 
he purchases $x_a$ is price times quantity, i.e., the rectangle $0x_aAp_a$. 
The consumers’ surplus from this purchase is the difference between these two 


3 J. Dupuit, “On the Measurement of the Utility of Public Works" (1844), translation 
in K. J. Arrow and T. Scitovsky (eds.), Readings in Welfare Economics (American 
Economic Association), Vol. XII, Richard D. Irwin, Inc., Homewood, Ill., 1969. 

4 Strictly speaking, the equilibrium requirement is $p_i/p_m = mu_i/mu_m$, where $p_i$ is 
the price of good $i$, $p_m$ is the price of money, and $mu_i/mu_m$ is the marginal rate of 
substitution of money for good $i$; that is, it tells us how much money the consumer is willing 
to give up for an additional unit of $i$. But since the price of money is unity (how many 
dollars exchange for a unit of U.S. currency?), we have $p_m = 1$, and the preceding 
equation becomes $p_i = mu_i/mu_m$ = marginal utility of $i$ measured in money. 


Figure 1 


areas, i.e., the difference between the money value of the total utility of 
his purchase and the money he actually pays for it. It is therefore 
represented by⁵ the roughly triangular area, $p_aAD$. 

The preceding measure of consumers' surplus offers a great advantage 
in application. One can often at least approximate its magnitude 
empirically. Where a reliable econometric estimate of the demand curve is 
available one can measure its area, thereby obtaining a calculation of 
consumers’ surplus. 

Even if one does not know the entire demand curve, one can determine 
a great deal from a partial knowledge of its shape. For example (Figure 1b), 
by knowing only the segment of the demand curve that is shown, one can 
evaluate the gain to consumers of an innovation that permits price to fall 
from $p_a$ to $p_b$. That gain will be indicated by the resulting expansion in the 
area representing consumers’ surplus, which is represented by $p_bBAp_a$. 
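
As a rough numerical sketch of the two calculations just described, suppose the estimated demand curve happens to be linear. The intercept, slope, and prices below are hypothetical figures chosen only for illustration, and the function names are not from the text. The first function measures the triangular surplus area at a given price; the second measures the gain from a price reduction using only the segment of the curve between the two prices.

```python
from fractions import Fraction

# Hypothetical linear demand curve p = A - B*q (illustrative numbers only).
A = Fraction(10)    # price at which quantity demanded falls to zero
B = Fraction(1, 2)  # slope of the demand curve


def quantity_demanded(price):
    """Quantity bought at a given price on the assumed demand curve."""
    return (A - price) / B


def consumers_surplus(price):
    """Area between the demand curve and the horizontal price line."""
    q = quantity_demanded(price)
    return Fraction(1, 2) * (A - price) * q  # triangle above the price line


def gain_from_price_fall(p_a, p_b):
    """Trapezoid between the two price lines; needs only that demand segment."""
    q_a, q_b = quantity_demanded(p_a), quantity_demanded(p_b)
    return Fraction(1, 2) * (p_a - p_b) * (q_a + q_b)


surplus_at_6 = consumers_surplus(Fraction(6))
surplus_at_4 = consumers_surplus(Fraction(4))
gain = gain_from_price_fall(Fraction(6), Fraction(4))
print(surplus_at_6, surplus_at_4, gain)
```

With these assumed figures the gain from the price fall equals the difference between the two surplus triangles, which is the point of the partial-knowledge argument: the part of the curve above the higher price never enters the calculation.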

There are some theoretical objections to this procedure which are 
potentially significant but which, fortunately, seem likely to be un- 
important in many cases. We have measured total utility and, hence, 
surplus in money terms. But at different points of the demand curve the 
consumer will be left with different amounts of money, and so the value 


(marginal utility) of money to him will have changed. We are back at the 
rubber-yardstick problem.⁶ 


5 Similarly, producers’ surplus is the difference between the producer’s total revenue 
from his output and the area under his marginal cost curve. Under pure competition, the 
producers’ surplus is captured by the landlord in the form of rent (see Chapter 24). 
That is how it is possible to have a producers’ surplus and yet zero profit in competitive 
equilibrium. 

6 We can get rid of the problem in theory by keeping the consumer’s real income 
constant so that its value to him does not change. In this case we have eliminated the 
income effect and our demand curve becomes a compensated demand curve (Chapter 9, 
Section 15). Consumers’ surplus can be measured as the area under this compensated 
demand curve, but, unfortunately, we have no way of observing this compensated 
curve statistically. 
The issue is brought out by an indifference diagram which also illustrates 
some alternative ways of measuring consumers’ surplus. In Figure 2 the 
consumer is in equilibrium at point a, where his budget line pp’ is tangent 
to indifference curve I. If he were not permitted to buy the commodity, 
he would end up at point p (with Op dollars and zero units of good X) on 
lower indifference curve J. Obviously, a quantity of money sufficient to 
get him from lower curve J to higher curve I is a measure of consumers’ 
surplus. But what is this amount? Is it quantity of money pc on the vertical 
axis? Or the length ba above $x_a$? Or the vertical distance between the two 
indifference curves at some intermediate location? These will not all be the 
same unless the indifference curves are parallel throughout. The point is 
that the amount of money that constitutes a given gain in utility to the 
consumer varies with the amount of money (and X) in his possession, and 
so there is no one correct money measure of the net benefit he gets from his 
purchase of X, and hence there is no one correct consumers’ surplus figure. 
Only if the variation in the money measure is small, i.e., if the indifference 
curves are nearly parallel, can one calculate the consumers’ surplus from 
a demand curve with any degree of confidence.⁷ 


Figure 2 (vertical axis: money; horizontal axis: quantity of X) 


7 Robert Willig of Bell Laboratories in his important work on the subject has 
designed operational criteria which permit the analyst to judge from observed data how 
large the error in a consumers’ surplus estimate can be in any particular case. Moreover, he 
concludes “... it is clear that in most applications the error of approximation will be 
very small. In fact, the error will often be overshadowed by the errors involved in 
estimating the demand curve.” R. D. Willig, “Consumer’s Surplus Without Apology,” 
American Economic Review, Vol. 66, 1976. 
3. Pareto Optimality and Productive Efficiency 


An alternative approach to the choice of social maximand which has 
proved very fruitful analytically is associated with the work of Vilfredo 
Pareto. Again leaving its underlying philosophy to a later section, we will 
discuss it here only as an operational concept. Taking the objective of 
society to be in some sense the maximization of the welfare of all of its 
members we run at once into an intractable problem. We simply cannot 
add up the utilities of different individuals, any more than we can add up 
its production of salami and its output of brandy. Even if we were to 
know how to measure absolute pleasure (utility) for each individual 
(which ordinalists deny), we certainly would not know how to compare 
4 units of individual a’s utility with 6 units of utility for b. 

A similar problem obviously arises in the aggregation of consumers’ 
surplus for different individuals. Even though consumers’ surplus is 
measured in a unit that is ostensibly common, i.e., money, it is apparent 
that a dollar means different things to different individuals. We may not 
know how to prove that $10 is worth more to a poor man than it is to a 
wealthy vice president of the United States, but few of us would be 
prepared to deny it. Therefore we are hardly comfortable in arguing that 
a given proposal is socially desirable even though it reduces the poor 
person's consumer surplus by $8 because it increases the wealthy in- 
dividual’s surplus by $12, for is this really a net gain of $4? One can, of 
course, assign different weights to different individuals in the process of 
aggregating consumers’ surplus, but how such weights should be chosen 
is not quite clear. 

The welfare economist therefore retreats to a second line of argument 
which does not offer us the possibility of measurement that a consumers’ 
surplus approach often provides, but it does often yield significant results 
nevertheless. This alternative approach asserts that, at the very least, a 
social optimum must not succumb to the error of the dog in the manger— 
that is, at the very least, a maximum must offer everything to any one 
individual that it can provide him without harming anyone else. More 
concretely, a very weak requirement for optimality is that selecting any 
one member of society arbitrarily (call him individual 1) his utility, $u_1$, 
should be made as large as possible while making sure that there is no 
loss in the utility $u_2, \dots, u_m$ of any other one of the m persons in the 
community. 

This approach, which is referred to as Pareto optimality, then treats 
social optimality as a problem in constrained maximization, with the 
utility of some one person maximized and each other person's utility 


function serving as a constraint. That is, 


Definition: Pareto optimality: Letting $x_{ij}$ be the quantity of 
commodity $i$ consumed by individual $j$, and letting that person's utility 
function be $u_j = f^j(x_{1j}, \dots, x_{nj})$, Pareto optimality requires that society 
maximize the utility of some arbitrarily selected individual, i.e., that it 


maximize $u_1 = f^1(x_{11}, \dots, x_{n1})$ 


subject to the requirement⁸ that there be no loss in the utility of any 
other individual $2, \dots, m$: 


$u_2 = f^2(x_{12}, \dots, x_{n2}) = k_2$ 

$\vdots$ 

$u_m = f^m(x_{1m}, \dots, x_{nm}) = k_m$ 


In the next section we will see what sorts of results can be obtained 
from such a formulation. First we pause briefly to illustrate the flexibility 
of the approach by noting how it can be adapted to other problems. For 
example, since we can no more add up the outputs of different com- 
modities than we can add the utility of different individuals, we adopt an 
analogous approach to productive efficiency: 


Definition: Productive efficiency: Let $y_i = g^i(r_{1i}, \dots, r_{wi})$ be the 
output of any commodity $i$ using the quantity $r_{ki}$ of input $k$, etc. Then 
efficiency requires (for commodity 1, any arbitrarily selected commodity) 
that we 


maximize $y_1 = g^1(r_{11}, \dots, r_{w1})$ 

subject to the requirement that there be no reduction in any other output 


$y_2 = g^2(r_{12}, \dots, r_{w2}) = c_2$ 


8 Actually it is appropriate to write the constraints as inequalities 
$u_j = f^j(x_{1j}, \dots, x_{nj}) \ge k_j$ 
since there obviously is no objection to a beneficial change in j’s utility. This requires 
the use of Kuhn-Tucker methods rather than the standard Lagrangian techniques of 
the differential calculus, and so we use the equality forms of the constraints as an 
expository simplification. 
and the constraints given by the available quantity of each input 


$r_{11} + \dots + r_{1n} = r_1$ (total amount of input 1 in the economy) 

$\vdots$ 

$r_{w1} + \dots + r_{wn} = r_w$ (total amount of input w in the economy). 


It should be noted that to be Pareto optimal an allocation of resources 
must be efficient. For if it were not efficient, it would be possible to increase 
output 1 without a loss in any other output. Hence, if there is at least one 
individual in the economy who prefers more of item 1, it is possible to 
benefit that individual without harming anyone else by giving him the 
increased output of commodity 1. Thus, the initial situation cannot have 
been Pareto optimal. 

In sum, efficiency is necessary for Pareto optimality since the absence 
of the former means that society has not taken advantage of every oppor- 
tunity to benefit one person without harming others. However, the converse 
proposition is not valid. An allocation of resources may be efficient and 
yet not Pareto optimal. This is obviously so because society can turn out 
combinations of goods which are not ideally suited to the tastes of the 
individuals in the community and yet it can be efficient in the way it 
produces that nonoptimal combination. In a world of coffee drinkers it 
would not be Pareto optimal to produce lots of tea and no coffee, and yet 
that does not preclude efficiency in the production of tea! 

Obviously, Pareto optimality analysis sidesteps the issue of income 
distribution. Economists have tried various approaches, none of them fully 
satisfactory, to the problem of income distribution.⁹ But the marginal 
optimality rules, which generally rest on a Paretian foundation, themselves 
have benefitted little from this discussion. They remain either silent or 
prejudiced in favor of the status quo on the issue of income distribution and 
are, therefore, necessarily incomplete or unsatisfactory even on matters 
for which distribution is not the primary issue. Ultimately, the Paretian 
approach can be considered the welfare economists’ instrument par 
excellence for the circumvention of this issue. 


4. Optimal Distribution of Products Among Consumers¹⁰ 


Given the amounts of the various goods which have been produced, 
how can these commodities best be divided up among the members of the 


9 For a review of the available analysis, see Amartya Sen, On Economic Inequality, 
Oxford University Press, Inc., New York, 1973. 

10 For an excellent alternative discussion of the materials of the next few sections 
see F. M. Bator, “The Simple Analytics of Welfare Maximization," American Economic 
Review, Vol. 47, March 1957. 


consuming public? It would seem that this problem runs us right into 
insuperable problems involved in deciding who deserves what. But we can 
avoid this issue and yet arrive at 


Proposition 1. Pareto optimal allocation of goods among consumers: For 
any two products X and Y and any two consumers 1 and 2, consumer 1's 
marginal rate of substitution of X for Y must be the same as that of 
consumer 2; that is, to both consumers the ratio of the marginal utilities 
of the two products must be the same.¹¹ 


To show why this must be so, we note that if the condition were 
violated so that an additional unit of X were not worth the same number 
of units of Y to both consumers (unequal marginal rates of substitution), 
they could both benefit by a simple exchange. Suppose that to consumer 1 


11 Proof: Using the model of Pareto optimality described in Section 3, we see how 
far we can increase the utility, $u_1 = f^1(x_1, y_1)$, of, say, the first consumer without 
reducing the utility, $u_2 = f^2(x_2, y_2)$, of the other, using any arbitrary utility index 
consistent with the preferences of the two consumers. Here $x_1$ is the amount of X in 
the hands of the first consumer, $y_2$ is the amount of Y which the second consumer 
possesses, etc. Our task, then, is to 


maximize $u_1 = f^1(x_1, y_1)$ 


subject to 
$f^2(x_2, y_2) = k_2$ (a constant), i.e., the second consumer must not be hurt, and 
$x_1 + x_2 = X^*$ 
$y_1 + y_2 = Y^*$ 
i.e., the total available amounts of X and Y are fixed. 


To find this maximum we form the Lagrangian expression (Chapter 4, Section 8) 
$u_\lambda = f^1(x_1, y_1) + \lambda[f^2(x_2, y_2) - k_2] + \lambda_x(x_1 + x_2 - X^*) + \lambda_y(y_1 + y_2 - Y^*)$. 


Differentiating partially in turn with respect to $x_1$, $x_2$, $y_1$, and $y_2$ and setting each of 
the results equal to zero we obtain 


$\frac{\partial u_\lambda}{\partial x_1} = \frac{\partial f^1}{\partial x_1} + \lambda_x = 0$, $\quad \frac{\partial u_\lambda}{\partial x_2} = \lambda \frac{\partial f^2}{\partial x_2} + \lambda_x = 0$ 

$\frac{\partial u_\lambda}{\partial y_1} = \frac{\partial f^1}{\partial y_1} + \lambda_y = 0$, $\quad \frac{\partial u_\lambda}{\partial y_2} = \lambda \frac{\partial f^2}{\partial y_2} + \lambda_y = 0$ 


Now we know $\partial f^1/\partial x_1 = mu_{x1}$ (the marginal utility of X to individual 1, etc.), and 
solving the preceding equations for these marginal utilities we obtain 


$mu_{x1} = -\lambda_x$; $\quad mu_{y1} = -\lambda_y$; $\quad mu_{x2} = -\lambda_x/\lambda$; $\quad mu_{y2} = -\lambda_y/\lambda$. 
Thus by straightforward division 


$\frac{mu_{x1}}{mu_{y1}} = \frac{mu_{x2}}{mu_{y2}} = \frac{\lambda_x}{\lambda_y}$ 


which is Proposition 1. 
an additional unit of X has the same utility as 1.7 additional units of Y, 
whereas to consumer 2 the marginal unit of X is worth 1.9 additional 
units of Y. If they trade, consumer 1 giving a unit of good X to 2 and 
receiving, say, 1.8 units of Y in exchange, each consumer must consider 
himself 0.1 units ahead. Each consumer receives a higher utility in exchange 
for a lower—a clear gain for both parties. This same sort of mutually 
profitable trade can for the same reasons always be arranged if the con- 
ditions of Proposition 1 are violated, so that the distribution of com- 
modities cannot then be optimal because some opportunities to improve 
the situation remain unused. 
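
The arithmetic of this exchange can be checked with a short sketch. The marginal rates of substitution and the 1.8 terms of trade are the figures in the text; the function and variable names are illustrative only, and the calculation treats the marginal valuations as constant over the one unit traded, which is an approximation.

```python
from fractions import Fraction

# Marginal rates of substitution from the text: units of Y that one
# additional unit of X is worth to each consumer.
mrs_consumer_1 = Fraction(17, 10)
mrs_consumer_2 = Fraction(19, 10)
terms_of_trade = Fraction(18, 10)  # units of Y paid per unit of X


def gains_from_trade(mrs_seller, mrs_buyer, y_per_x):
    """Gain to each party, in units of Y, when the seller gives up one unit of X."""
    seller_gain = y_per_x - mrs_seller  # receives more Y than the X was worth to him
    buyer_gain = mrs_buyer - y_per_x    # pays less Y than the X is worth to him
    return seller_gain, buyer_gain


gain_1, gain_2 = gains_from_trade(mrs_consumer_1, mrs_consumer_2, terms_of_trade)
print(gain_1, gain_2)
```

Any exchange ratio strictly between 1.7 and 1.9 would leave both gains positive; 1.8 merely splits the difference evenly.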

The graph of the Pareto optimal solution to the problem of exchange 
turns out to be the contract curve in the Edgeworth box diagram (Figure 
3a). It will be recalled from our discussion of this diagram in Chapter 16, 
Section 9, that the length of the horizontal axis of the diagram represents 
the total quantity of X in the hands of both individuals together and that 
the length of the vertical axis is the total quantity of Y. With the lower 
left-hand corner serving as the origin for individual 1 and the upper right- 
hand corner as that for individual 2, any point in the diagram represents 
a distribution of the total X and Y between the two persons. If some point, 
A, represents the initial pretrade distribution of the goods between the 
two individuals, a trade is represented by a move to another point in the 
diagram. Any move from A to a point such as E in the shaded region must 
represent an improvement to both parties since it puts each one on a higher 
indifference curve relative to his origin. 

A Pareto optimal solution is represented by any point on the contract 
curve, CC’, which is the locus of points of tangency, T, between indifference 
curves of the two persons. It is clear that these are the points that satisfy 
Proposition 1, since at any such point the slopes of 1’s and 2’s indifference 


Figure 3. (a) Edgeworth box: total X available on the horizontal axis, total Y available on the vertical axis, individual 1's origin at the lower left and individual 2's origin at the upper right. (b) Production possibility locus. 


curves must be equal; i.e., their marginal rates of substitution between the 
two goods must be the same. We see at once why any point such as B on 
CC’ must be Pareto optimal and why any point such as A off CC’ is not. 
For we have seen that any move from A anywhere into the shaded area is 
mutually beneficial and hence A is not Pareto optimal (it does not provide 
the maximal benefits available to 1 without harming 2). On the other 
hand, any move from a point on the contract curve must put at least one 
person on a lower indifference curve. Thus B is Pareto optimal because it 
permits no change which benefits one party without harming the other. 
We conclude: 


(a) That there is never just one Pareto optimal solution; indeed, there 
is generally an infinity of Pareto optima, each given by a different point in 
the diagram, i.e., each representing a different distribution of benefits 
between the two persons; 

(b) We are not entitled to conclude that every Pareto optimal solution 
is better than every nonoptimal solution. For example, we cannot compare, 
with the aid of the Pareto optimality analysis alone, the Pareto optimal 
point B with the nonoptimal point D since while individual 2 prefers the 
former, individual 1 prefers the latter. So without an explicit interpersonal 
comparison we simply cannot judge the relative desirability of the two 
distributions represented by the two solutions. 
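
Both conclusions can be illustrated with a small sketch. Everything numerical here is an assumption made for illustration: both consumers are given the utility index u = x·y, the totals of X and Y are set at 10 each, and the points labeled B, D, and E are hypothetical allocations, not readings from Figure 3a.

```python
from fractions import Fraction

TOTAL_X = Fraction(10)
TOTAL_Y = Fraction(10)


def utilities(x1, y1):
    """Utility pair (u1, u2) when consumer 1 holds (x1, y1) and consumer 2 the rest."""
    x2, y2 = TOTAL_X - x1, TOTAL_Y - y1
    return (x1 * y1, x2 * y2)  # assumed utility index u = x*y for both


def pareto_dominates(p, q):
    """True if allocation p leaves nobody worse off than q and someone better off."""
    up, uq = utilities(*p), utilities(*q)
    return all(a >= b for a, b in zip(up, uq)) and any(a > b for a, b in zip(up, uq))


def on_contract_curve(x1, y1):
    """Equal marginal rates of substitution: y1/x1 == y2/x2 when u = x*y."""
    x2, y2 = TOTAL_X - x1, TOTAL_Y - y1
    return y1 * x2 == y2 * x1


B = (Fraction(3), Fraction(3))          # on the contract curve
D = (Fraction(6), Fraction(5))          # off the contract curve
E = (Fraction(11, 2), Fraction(11, 2))  # on the curve, preferred to D by both

print(on_contract_curve(*B), on_contract_curve(*D), on_contract_curve(*E))
print(pareto_dominates(B, D), pareto_dominates(D, B), pareto_dominates(E, D))
```

Under these assumptions B and E are both Pareto optimal, which illustrates conclusion (a). Neither B nor D dominates the other, as in conclusion (b), even though D is not optimal: a move from D to E benefits both consumers.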


Proposition 1 consequently tells us something about how commodities 
should be distributed among individuals without saying anything about how 
income should be distributed among them! It does this by telling the con- 
sumers that they should end up somewhere on the contract curve but 
dodging the important problem of their optimal location on the contract 
curve. The rule avoids talking about distribution by committing itself to 
little more than the apparently trivial assertion that if you have a bottle 
of whiskey (which you dislike) and I have a fifth of gin (which I detest), 
why—let’s swap! However, the result should not be scorned. Many a sage 
has missed this apparently simple point and argued that whatever one of 
the traders gains he must have taken away from the other, so that there 
can be no net advantage from trade. Moreover, out of such apparently 
trivial statements are formed the axiom systems on which powerful 
theories are built and important insights gained. We shall see presently 
how the weak result of this section can be used as a basis for some con- 
clusions about the design of a rationing system. 


5. Optimal Use of Resources in Producing Given Outputs 


We come now by a completely analogous argument to a second marginal 
rule for social optimality: 
Proposition 2. Productive efficiency in the allocation of several inputs: 
An efficient use of any two inputs i and j, in the production of outputs 
X and Y, requires that 


$mp_{ix}/mp_{jx} = mp_{iy}/mp_{jy}$; 


i.e., it requires that the ratio of the marginal physical products of i and j 
in the production of X be the same as the corresponding ratio for 
commodity Y. 


This is so because if equality does not hold, say if the first fraction is 
larger, i will be relatively more efficient in producing X than is j (i will 
have a comparative advantage in X production), and it will pay to shift 
some of input i into X production and out of Y production and to shift 
some of j the other way, from X to Y. The argument is precisely analogous 
to that involved in the preceding marginal rule. Suppose (as an 
illustration) we have the figures 


$mp_{ix} = 20$, $mp_{jx} = 8$, $mp_{iy} = 4$, and $mp_{jy} = 2$ 


so that, as the reader can verify, the left-hand fraction in Proposition 2 is 
the larger of the two. Now move one unit of i out of Y production and 
into the manufacturing of X so that the output of Y decreases by four 
units. Next, move enough input j in the other direction to keep the output 
of Y on the same level as it was originally; i.e., move two units of j into Y 
production (where each unit produces two units of Y) so that the output 
of Y goes up by $2 \times 2 = 4$ units, back to its old level. The result is a net 
increase in the output of X (with the output of Y remaining unchanged), 
for X’s output has first been raised $(\Delta i)(mp_{ix}) = 1 \times 20 = 20$ units 
and then been decreased by $(\Delta j)(mp_{jx}) = 2 \times 8 = 16$ units, leaving a net 
gain of four units. The reader can try other illustrative figures and see that 
they always yield similar results.¹² Moreover, we can instead keep the 
output of commodity X unchanged and increase that of Y or obtain smaller 
increases in both outputs. 
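
The reshuffling of inputs in this illustration can be traced step by step in a short sketch. The marginal products are the figures given in the text; the function name and data layout are illustrative, and the calculation assumes the marginal products stay constant over the small shifts involved.

```python
from fractions import Fraction

# Marginal physical products from the text's illustration, keyed by (input, output).
mp = {("i", "X"): Fraction(20), ("j", "X"): Fraction(8),
      ("i", "Y"): Fraction(4), ("j", "Y"): Fraction(2)}


def reallocate(delta_i):
    """Shift delta_i units of i from Y to X, then move enough j from X to Y to restore Y."""
    delta_j = delta_i * mp[("i", "Y")] / mp[("j", "Y")]  # units of j needed to restore Y
    change_in_y = -delta_i * mp[("i", "Y")] + delta_j * mp[("j", "Y")]
    change_in_x = delta_i * mp[("i", "X")] - delta_j * mp[("j", "X")]
    return delta_j, change_in_x, change_in_y


delta_j, change_in_x, change_in_y = reallocate(Fraction(1))
print(delta_j, change_in_x, change_in_y)
```

Replacing the four marginal products with other figures for which the two ratios differ should again leave Y unchanged and X changed by a nonzero amount; if the change in X comes out negative, the profitable shift simply runs in the opposite direction.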

The upshot is that if this last marginal equation is violated, a switching 
around of inputs can give us something for nothing: it can increase one or 
both outputs without any increase in input use! Hence, because there are 
such unused opportunities, the input cannot possibly be optimally 
employed when this last equation is violated. Where the marginal analysis 
is applicable, production is therefore said to be inefficient when this 
equation does not hold and efficient if the equation is satisfied (along with the 
second-order conditions). 


12 The mathematical proof of this theorem is virtually the same as that in footnote 
11. The relevant maximand and the constraints were described in the discussion of 
productive efficiency in Section 3. From these the reader can form the Lagrangian 
expression for this constrained maximization problem and complete the proof of 
Proposition 2 by direct analogy with footnote 11. 

The diagrammatic representation of the analysis can again be ex- 
pressed in terms of the box diagram with the axes indicating the total 
input quantities available and the indifference curves interpreted as iso- 
product loci. The contract curve is, this time, the locus of efficient points. 
The matter can also be represented in terms of the production possibility 
locus (Figure 3b). Let the points in the shaded region represent every 
combination of outputs that can be produced with the available combina- 
tion of inputs. Then every point such as E that is inside the region is 
inefficient since it is possible to move from it to a point such as F at which 
more of each output is available. The northeast boundary of the region, 
curve SS’, is the locus of efficient points. Again note that there is an infinity 
of efficient points and that one cannot say that every efficient point is 
preferable to every inefficient point. For example, we cannot rank F and 
G from the information at our disposal in this discussion. 


6. Marginal Rule for Pareto Optimal Output Levels 


We come now to what is perhaps the most important and most con- 
troversial of the marginal rules of welfare economics. This is the rule which 
tells us how much coffee and how much salami should be produced with 
society’s scarce resources. 

The allocation of a limited quantity of resources among alternative 
outputs is dealt with by the following optimality rule, which is closely 
related to the one we have just discussed: 


Proposition 3. Optimal relative outputs: If resources are to be allocated 
optimally between any two outputs X and Y, then the marginal rate of 
substitution between X and Y for every individual $j = 1, 2, \dots, m$ who 
consumes some of each good must be equal to the ratio of the marginal 
(social) costs of production of the two goods. That is, we must have 


$\frac{mu_x^1}{mu_y^1} = \frac{mu_x^2}{mu_y^2} = \dots = \frac{mu_x^m}{mu_y^m} = \frac{mc_x}{mc_y}$. 


Here marginal cost of X, $mc_x$, can be interpreted to mean the value of the 
resources needed to produce an additional unit of X, etc. Note that the 
equality of the marginal rates of substitution $mu_x^j/mu_y^j$ for all consumers, 
j, is the requirement of Pareto optimality in the distribution of society’s 
outputs (Proposition 1). 
The reasoning underlying Proposition 3 is so close to that behind the 
previous marginal rules that it need not be repeated. The reader should 
be able to show that if output levels are such that the equation is violated, 
total social utility can be increased by an increase in one output together 
with such a decrease in the other that there is no net change in total social 
costs (the use of society’s resources). So once again, if this equation is 
violated, society has missed an opportunity to get something (utility) for 
nothing, and therefore output levels cannot be optimal. 
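
The exercise left to the reader can be sketched numerically. All figures below are hypothetical: a consumer is assumed to value an extra unit of X at 2 units of Y, while the marginal costs of X and Y are taken to be 3 and 2 units of resources. The function name is illustrative, and marginal values are treated as constant over the one-unit change.

```python
from fractions import Fraction


def net_gain_in_y(mrs_x_for_y, mc_x, mc_y, extra_x=Fraction(1)):
    """Net benefit, in units of Y, of producing extra_x more X at unchanged total cost."""
    y_given_up = extra_x * mc_x / mc_y           # Y output cut to free the resources
    y_equivalent_gained = extra_x * mrs_x_for_y  # what the extra X is worth to the consumer
    return y_equivalent_gained - y_given_up


gain_when_violated = net_gain_in_y(Fraction(2), Fraction(3), Fraction(2))
gain_when_satisfied = net_gain_in_y(Fraction(3, 2), Fraction(3), Fraction(2))
print(gain_when_violated, gain_when_satisfied)
```

When the marginal rate of substitution (2) exceeds the marginal cost ratio (3/2), the shift yields a gain worth half a unit of Y at no additional resource cost; when the two ratios are equal, no such gain is available.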


7. An Optimal Price System 


Let us temporarily ignore the distinction between social and private 
costs and benefits. Then we can state 


Proposition 4: Suppose we institute a price system which has the fol- 
lowing characteristics: 


1. All inputs and outputs have fixed prices which are the same 
for every buyer and seller and which no buyer or seller can change. 

2. All quantities supplied are demanded and hence sold, i.e., 
these are the equilibrium prices of the general equilibrium system.¹³ 

3. Any firm can enter (or leave) the production of any commodity 
at these prices if it finds it profitable to do so. 


Then, under these circumstances, if every consumer maximizes his 
utility and every firm maximizes its profits, all of the preceding marginal 
optimality requirements will automatically be satisfied. 


This is a fundamental theorem of welfare economics. It is a refined 
version of the invisible hand proposition, and the reader would do well to 
convince himself how very remarkable it is. 

Outline of proof that Proposition 1 must hold under such a price 
system: We know (Chapter 9, Section 5) that if prices are fixed, each 
consumer will, if he behaves optimally, buy any two commodities X and 
Y in such amounts that for him the marginal rate of substitution of X for 
Y is equal to the ratio of the prices of the two items. Since these prices 
are the same for all consumers by characteristic (1) of our price system, 


13 Here we have to ignore free goods and goods which are a drug on the market. Not 
all the water near a large lake will find buyers, and desert land in the middle of the 
Sahara is likely to go begging for customers. This skips over the problem of deciding 
which goods will be free, for that cannot be known in advance. Whether or not some 
item will be unwanted depends on how much money customers have left over from their 
other purchases. See Chapter 23, Section 5. 
the marginal rates of substitution of X for Y must be the same for all 
consumers, as Proposition 1 requires. Note that this result follows with 
any fixed prices, no matter how arbitrarily or randomly they are selected. 
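
A minimal sketch of this result follows. It assumes, purely for illustration, that each consumer has a Cobb-Douglas utility function $u = x^a y^{1-a}$, for which the utility-maximizing purchases have a well-known closed form. The prices, incomes, and taste parameters are arbitrary, and the function names are not from the text.

```python
from fractions import Fraction


def cobb_douglas_demand(share_x, income, price_x, price_y):
    """Utility-maximizing bundle for u = x**a * y**(1 - a) with a = share_x."""
    return share_x * income / price_x, (1 - share_x) * income / price_y


def mrs_x_for_y(share_x, x, y):
    """Ratio of marginal utilities mu_x / mu_y for the same utility function."""
    return share_x * y / ((1 - share_x) * x)


price_x, price_y = Fraction(7), Fraction(3)  # arbitrary fixed prices
# (taste parameter a, money income) for two quite different consumers
consumers = [(Fraction(1, 4), Fraction(100)), (Fraction(2, 3), Fraction(900))]

rates = []
for share_x, income in consumers:
    x, y = cobb_douglas_demand(share_x, income, price_x, price_y)
    rates.append(mrs_x_for_y(share_x, x, y))
print(rates)
```

Despite their different tastes and incomes, both consumers end up with a marginal rate of substitution equal to the price ratio 7/3, and changing the two prices to any other positive values changes the common ratio but not the equality.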

Proposition 2 must be satisfied for exactly the same reason. As is 
shown in Chapter 11, if a firm's input and output prices are fixed, it will pay 
the firm to hire any two inputs, i and k, in such proportions that the ratio of 
their prices equals the ratio of their marginal products. Proposition 2 
follows at once, since for any two firms, one of which uses the two inputs 
to produce X and the other to produce Y, we must have 
$mp_{ix}/mp_{kx} = p_i/p_k = mp_{iy}/mp_{ky}$. Again, it is remarkable that this result holds with 
any arbitrarily chosen prices! 

Proposition 3 is considerably more difficult to derive, and the argument 
will only be sketched in. In outline, the argument consists of showing that 
every industry will expand each of its outputs to a point where the ratio 
of their marginal costs is equal to the ratio of their prices. Moreover, all 
consumers will determine their purchases to a point where the marginal 
rates of substitution are equal to the products’ price ratios. Marginal cost 
ratios and marginal rates of substitution will therefore all be equal to the 
same price ratio and, therefore, equal to each other, as Proposition 3 
requires. 

The pricing arrangement of Proposition 4 does its job because it pro- 
vides the right signals to producers and consumers. For example, its 
relative prices are set equal to relative marginal costs (i.e., costs to society 
in terms of its resources). Thus, when the consumer divides up his budget 
in such a way as to get the most from his own money resources he also 
automatically makes the decisions that get the most out of society’s re- 
sources. He will not be induced to consume a set of items that gives him 
no more pleasure than another but which requires twice as large a quantity 
of resources to produce. A similar argument shows that these prices 
provide the correct financial signals to producers. 

However, the consumer side of the argument still suffers from a 
fundamental weakness arising out of the issue of income distribution. 
In effect, this procedure tells us to determine the allocation of resources 
on the basis of the marginal rates of substitution of the different con- 
sumers as indicated by their expenditures. This has been compared to an 
election in which some voters get to vote many times—where each dollar 
the consumer has to spend entitles him to another vote so that the wealthy 
can exercise an influence proportioned to the magnitude of their wealth. 
We decide that an optimal allocation of resources requires us to produce 
more marmalade for Ellen rather than more jam for Daniel if Ellen can 
afford to pay for it and Daniel cannot. Thus, in this interpretation of 
Proposition 3 we have not avoided committing ourselves on the question 
of a good distribution of wealth; rather we have, by default, decided to 
accept the status quo. 


8. Pure Competition and Monopoly 


Using Proposition 4, which states that the price system of the preceding 
section guarantees the satisfaction of the marginal optimality rules, we 
can at once deduce several of the best-known results of welfare economics. 

First, there is 


Proposition 5: Perfect or pure competition will tend to yield an optimal 
allocation of resources. 


Under pure competitive equilibrium prices are fixed as far as any 
individual consumer or businessman is concerned; supplies and demands 
are all equal, and, in the long run, all firms which can produce any product 
profitably will enter that industry. Thus all three requirements of the 
price system of the preceding section will be met by a pure-competition 
equilibrium, which must therefore satisfy all of the marginal requirements 
for an optimal allocation of resources. Hence, there is a presumption that 
under pure competition the allocation of resources will be optimal.¹⁴ This 
result and the material of the two previous sections constitute the elaborate 
superstructure which has been superimposed on the old commonsense 
notion that competition is a good thing because it prevents monopolistic 
exploitation of consumers and labor. However, one may well wonder 
whether the commonsense notion is still not more persuasive than its 
highly subtle and ingenious rationalization. 

The nature of the theoretical argument is brought out intuitively by 
contrasting the world of pure competition with one in which there are a 
few monopolistic industries. Since the price of the product falls as output 
expands, a monopoly will produce an output which is smaller than that of 
an otherwise identical competitive industry (unless there are very extreme 
economies of large-scale production), as was shown in Section 5 of Chapter 
16. This means that, given the total level of employment, fewer resources
will be used in the manufacture of monopolistically produced outputs and 
more of these resources will go into the remaining industries than would 
have been the case in an economy where pure competition was universal. 
Too little will be produced by the monopolies and too much by the
competitive industries. Resources will be overallocated to competitively
produced commodities and monopolistic output restricted, to the social
detriment.

14 Actually this is no more than a presumption, and there remain a number of flies in
the competitive ointment. We know that any optimal allocation to which marginal
analysis is applicable must satisfy these marginal rules, but the converse is not
necessarily true: an allocation may satisfy these rules yet not be optimal. There are two
specific and important sources of possible difficulty: The second-order conditions (see
Chapter 4, Section 5) may not be satisfied, and social and private costs and benefits
are unlikely to be equal throughout the economy. The second of these problems is
discussed in Section 11 of this chapter. For an explicit and extremely lucid discussion
of the relevance of the second-order conditions, see J. de Graaf, Theoretical Welfare
Economics, Cambridge University Press, New York, 1957, esp. pp. 22-26 and 66-70.
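
The contrast between the two regimes can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical industry with a linear inverse demand curve p = a - b*q and a constant marginal cost mc; neither the functional form nor the figures come from the text, and the function names are purely illustrative.

```python
# Hypothetical linear market: inverse demand p = a - b*q, constant marginal cost mc.
# The numbers below are assumptions chosen only for illustration.

def competitive_output(a, b, mc):
    """Industry output at which price equals marginal cost."""
    return (a - mc) / b


def monopoly_output(a, b, mc):
    """Output at which marginal revenue (a - 2*b*q) equals marginal cost."""
    return (a - mc) / (2 * b)


a, b, mc = 100.0, 1.0, 20.0
q_competitive = competitive_output(a, b, mc)
q_monopoly = monopoly_output(a, b, mc)

print("competitive output:", q_competitive)
print("monopoly output:   ", q_monopoly)
```

Under these assumed figures the monopolist produces half the competitive output, so that, with total employment given, the resources it releases must be absorbed by the remaining industries.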

However, the reader should observe that the argument breaks down in 
a world of monopolies where there are no industries operating under a 
regime of competition. Given the level of employment of resources, a mis- 
allocation can arise only if the demand for inputs of one set of industries 
(the monopolists) is low in comparison with that of the remaining industries. 
If each of a number of runners slows down, none of them need come in 
ahead of the others, and if each industry is weak in its bidding for resources, 
no lopsided allocation of these resources need result. We see, then, that 
some competition may conceivably be worse than none!¹⁵


9. Centralized Planning Without Central Direction 


The theorem about the optimality results which can be achieved with 
the aid of the pricing system of Section 8 has had yet another application— 
to the economics of socialism and central planning. A number of theorists
have argued that government planning does not have to involve elaborate
controls and instructions, such as factory-by-factory production targets.¹⁶
The central authority need only compute a set of prices which satisfies the 
three conditions described in Section 8 and order factory managers to 
maximize their profits (profit maximization, too, could be made more or 
less automatic by basing managerial wages on the profits shown by the 
plants which they run). The result would then be automatic—a self- 
policing system for the achievement of an optimal allocation of resources. 
One would achieve the alleged economic benefits of central direction with- 
out the costly administrative burden and unpleasant bureaucratic inter- 
ference which goes with detailed central supervision. 

The “socialism by price guidance” proposal has come to be known by 
one of its characteristics, marginal cost pricing. Every firm would be forced 
to sell as much as it produced and to sell this output at a price equal to 
its marginal cost. This is the essence of the pricing arrangement. 

The idea is attractive, but it encounters a number of important diffi- 
culties. Perhaps the most important is that the equilibrium which results 
will maximize private rather than social net benefits. If, for example, it is
important for the nation, on some criterion, to sacrifice some current
consumer welfare for long-run economic growth, the price system of Section 8
does not take these social goals into account. Other problems of differences
in social costs and returns will be discussed presently. The upshot is that
the proposed pricing system may fail to accomplish one of the main
purposes of central planning: the achievement of social goals which are
not reflected in private returns.

15 This is an illustration of the theorem of the second best (Proposition 12, below).
16 See, e.g., Lerner, op. cit., and Oskar Lange and Fred M. Taylor, On the Economic
Theory of Socialism, University of Minnesota Press, Minneapolis, 1938.

Another somewhat more technical problem which plagues such a scheme 
for decentralized control arises if the firm’s average costs decrease when 
the scale of its production increases (increasing returns).¹⁷ If average costs
are falling, by the standard rules of the average-marginal relationships 
(Chapter 3, Section 3), marginal cost must be less than average cost. 
Therefore, if the firm sells at a unit price equal to marginal cost, price must 
be less than average cost, i.e., unit costs will exceed unit returns so that 
the firm must lose money on each and every unit it sells! There is nothing 
the management of such a firm can do to make any profits, no matter how 
efficient its operations, if it sticks to a marginal cost price. Thus (even 
though it may sometimes be socially desirable to operate some industries 
which are unable to produce a profit) marginal cost pricing must, at the 
very least, lead to serious administrative difficulties in decreasing cost firms. 
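
The arithmetic of this difficulty is easily checked. The sketch below assumes a hypothetical firm whose total cost is a fixed outlay plus a constant cost per unit, so that average cost falls steadily with output; the cost figures and function names are illustrative assumptions, not taken from the text.

```python
# Hypothetical decreasing-cost firm: total cost = fixed + unit * q.
# Average cost falls as q rises, so marginal cost lies below average cost.

def total_cost(q, fixed, unit):
    return fixed + unit * q


def average_cost(q, fixed, unit):
    return total_cost(q, fixed, unit) / q


def marginal_cost(q, fixed, unit):
    # Derivative of total cost with respect to output; constant in this example.
    return unit


def profit_at_marginal_cost_price(q, fixed, unit):
    """Profit when every unit is sold at a price equal to marginal cost."""
    price = marginal_cost(q, fixed, unit)
    return price * q - total_cost(q, fixed, unit)


fixed, unit = 1000.0, 5.0          # assumed figures
outputs = [10.0, 100.0, 1000.0]
profits = [profit_at_marginal_cost_price(q, fixed, unit) for q in outputs]

for q, profit in zip(outputs, profits):
    print(q, average_cost(q, fixed, unit), marginal_cost(q, fixed, unit), profit)
```

In this assumed case the loss equals the fixed outlay at every output level, which is why no degree of managerial efficiency in choosing output can remove it.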


10. Breakeven Constraints and Optimal Deviations Between Prices and Marginal Costs


The discussion of the preceding section indicates that whatever its 
desirable properties, marginal cost pricing simply may not be possible for
society. If, for example, the preponderance of economic activity were
carried out by single-product firms with declining average cost curves, 
then most industries would either end up losing money or charging prices 
not equal to marginal costs. Even if every product were sold by firms at its 
marginal cost and the government were willing to subsidize firms by 
covering any resulting losses, the problem would not be solved, because 
the funds for the subsidies must be derived from taxes. Any tax on a 
commodity (including an income tax, which can be interpreted as a tax 
on hours worked) must itself introduce a deviation between the price of
the product to the consumer and its marginal cost, for the price to the 
consumer is then equal to the producer's marginal cost plus the tax. 

17 This problem is closely related to the difficulties which arise when the second-order
maximum conditions are not satisfied.

The issue, then, is what is the second-best set of prices, given the
undeniable fact that production costs must be covered from somewhere?
The answer was supplied in 1927 by Frank Ramsey, a Cambridge
philosopher, who, unfortunately, died very young. The Ramsey theorem can
be expressed in many forms. Two which are most widely recognized can
be described as follows: Suppose we are given any two commodities 1 and
2 whose total cost is $c(y_1, y_2)$, with marginal costs $mc_1$ and $mc_2$ and
marginal revenues $mr_1$ and $mr_2$. Suppose we must meet a budget requirement
such as

$$p_1 y_1 + p_2 y_2 = c(y_1, y_2).$$

Then one form of the Ramsey theorem asserts that


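Proposition 6a: The Pareto optimal prices $p_1$ and $p_2$ of two goods
whose outputs are subject to a budget constraint must satisfy

$$p_i - mc_i = \lambda\,(mr_i - mc_i) \quad (i = 1, 2) \qquad \text{or} \qquad \frac{p_1 - mc_1}{p_2 - mc_2} = \frac{mr_1 - mc_1}{mr_2 - mc_2},$$

where $\lambda$ is a constant that is the same for both goods.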


That is, the optimal deviation between price and marginal cost will be 
proportionate to the deviation between the marginal revenue and marginal 
cost of that commodity. A second, and more widely known, form of the
theorem asserts


Proposition 6b: If $E_1$ and $E_2$ are the elasticities of demand of two
goods whose outputs are subject to a budget constraint and all cross
elasticities of demand happen to be zero, then Pareto optimality of their
pricing requires

$$\frac{p_i - mc_i}{p_i} = \frac{k}{E_i} \qquad \text{or} \qquad \frac{(p_1 - mc_1)/p_1}{(p_2 - mc_2)/p_2} = \frac{E_2}{E_1}.$$


This form of the theorem, which is known as the inverse elasticity formula,
asserts that the optimal percentage deviation of the price of any item from
its marginal cost, $(p_i - mc_i)/p_i$, will vary inversely with the elasticity of
demand for that item, $E_i$.

To understand the logic of this rule we must first consider what it is 
that is undesirable about a deviation of price and the corresponding mar- 
ginal cost. Suppose two products A and B have equal prices but A has a 
marginal cost twice as great as B’s. With their prices equal, in equilibrium 
the individual who buys some of each must obtain just as much marginal 
utility from a unit of the one as from the other. But in that case, the 
allocation of resources cannot be Pareto optimal, for since A costs twice
as much in resources as B, a net gain can be achieved by shifting some
quantity of resources from A to B, producing two units of B for every
unit of A given up. Since the two goods have the same marginal benefit to
consumers, obviously this change is a net advantage to them, and failure
to carry it out is a violation of Pareto optimality. Every deviation of price
from marginal cost introduces such a distortion in consumers’ demands by
leading them to purchase goods in accord with their relative money costs,
which must then be different from their true social costs, i.e., their costs 
in terms of the resources used up in producing them. 

However, where society faces a budget constraint, as we have seen, 
prices must deviate from marginal costs. For example, where marginal 
cost pricing would yield a deficit because of widespread increasing returns,
prices may have to exceed marginal cost to prevent that deficit. But this 
can be carried out in a variety of ways with prices for different items 
exceeding their marginal costs by different percentages. The objective of 
Pareto optimality, now, is to find that set of deviations which meets the 
budgetary requirement while producing distortion in the allocation of 
resources no greater than the minimum necessary for the purpose. 

To determine what relative prices produce this minimal distortion we 
must consider relative demand elasticities for society’s different outputs. 
The statement that a product, j, has a highly elastic demand while product
k does not means that the former is highly sensitive to a given percentage 
price change while the latter is not. It follows that if prices must be changed 
from marginal costs but in a way that produces relatively little distorting 
effect on demands, then the bulk of the price rise should fall on items whose 
demands are comparatively inelastic. The more inelastic the demand for 
a good, the less a given percentage change in price from its marginal cost 
will distort its use from the Pareto optimal level. Thus, to meet the 
budgetary requirement, prices of items with inelastic demands should 
deviate from marginal costs by a relatively large percentage, and the 
reverse should be true for items with elastic demands. 
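
The inverse elasticity rule can be illustrated with a small computation. The sketch below assumes two goods with constant-elasticity demands y = a * p**(-E), constant marginal costs, and a fixed cost that marginal cost pricing would leave uncovered; these functional forms, the figures, and the function names are all illustrative assumptions rather than anything given in the text. It searches by bisection for the constant k at which the markups (p - mc)/p = k/E just cover the fixed cost.

```python
# Assumed setup: demand y_i = a_i * p_i**(-E_i), constant marginal cost mc_i,
# and a fixed cost to be recovered. Requires every elasticity E_i > 1.

def ramsey_prices(k, marginal_costs, elasticities):
    """Inverse elasticity rule (p - mc)/p = k/E, i.e. p = mc / (1 - k/E)."""
    return [mc / (1.0 - k / e) for mc, e in zip(marginal_costs, elasticities)]


def net_revenue(k, scales, marginal_costs, elasticities, fixed_cost):
    """Sum of (p - mc) * y over the goods, less the fixed cost."""
    prices = ramsey_prices(k, marginal_costs, elasticities)
    total = -fixed_cost
    for a, p, mc, e in zip(scales, prices, marginal_costs, elasticities):
        quantity = a * p ** (-e)
        total += (p - mc) * quantity
    return total


def solve_k(scales, marginal_costs, elasticities, fixed_cost, tol=1e-12):
    """Bisect on k in [0, 1]; k = 0 is marginal cost pricing, k = 1 monopoly pricing."""
    if min(elasticities) <= 1.0:
        raise ValueError("this sketch assumes every elasticity exceeds one")
    lo, hi = 0.0, 1.0
    if net_revenue(hi, scales, marginal_costs, elasticities, fixed_cost) < 0.0:
        raise ValueError("even monopoly pricing cannot cover the fixed cost")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_revenue(mid, scales, marginal_costs, elasticities, fixed_cost) < 0.0:
            lo = mid
        else:
            hi = mid
    return hi


scales = [100.0, 100.0]
marginal_costs = [1.0, 1.0]
elasticities = [1.5, 4.0]      # good 0 is the less elastic of the two
fixed_cost = 20.0

k = solve_k(scales, marginal_costs, elasticities, fixed_cost)
prices = ramsey_prices(k, marginal_costs, elasticities)
markups = [(p - mc) / p for p, mc in zip(prices, marginal_costs)]
print("k =", k)
print("prices =", prices)
print("percentage markups =", markups)
```

In this assumed example the good with the less elastic demand carries the larger percentage markup, and the two markups stand in the ratio of the elasticities, as the rule requires.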

This result has played an important role in the recent literature of 
welfare theory, of taxation, and of the theory of regulation of public utility 
rates. In tax theory it is used in an obvious way to discuss optimal tax 
rates on different commodities given the size of the government’s budget. 
In the theory of regulation it is used to determine Pareto optimal relative 
prices for the various products of a public utility which meet the budgetary 
requirement that the firm earn just enough to cover its costs (including a 
return on its capital) but without giving it any monopoly profit. It can be 
shown, incidentally, that the theorem automatically calls for prices that 
are exactly equal to marginal costs where such a pricing policy does just 
happen to cover total costs. 

To outline the proof of the theorem, first recall that for any product, i, 
consumer equilibrium requires that the marginal utility (measured in 
money) of that commodity to each individual be equal to its price. Suppose 
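now that we construct an artificial social utility function $u(y_1, \dots, y_n)$,
also measured in money, which is designed to have the same property, i.e.,

$$\frac{\partial u}{\partial y_i} = p_i \quad \text{for all commodities } (i = 1, \dots, n).$$

We assume that society’s objective is to maximize net surplus, which is
the difference between this total money utility of its outputs and the
(money) cost of the resources used in their production, $c(y_1, \dots, y_n)$.
Then the problem is to

$$\text{maximize} \quad u(y_1, \dots, y_n) - c(y_1, \dots, y_n)$$

subject to the budget requirement which characterizes the problem:

$$p_1 y_1 + p_2 y_2 + \cdots + p_n y_n = c(y_1, \dots, y_n).$$

This yields the Lagrangian function [writing $u(\cdot)$ for $u(y_1, \dots, y_n)$, etc.]

$$U_\lambda = u(\cdot) - c(\cdot) + \lambda\,[p_1 y_1 + \cdots + p_n y_n - c(\cdot)].$$

The first-order maximum conditions then include

$$\frac{\partial U_\lambda}{\partial y_1} = \frac{\partial u}{\partial y_1} - \frac{\partial c}{\partial y_1} + \lambda\left(\frac{\partial \sum_i p_i y_i}{\partial y_1} - \frac{\partial c}{\partial y_1}\right) = 0$$

$$\vdots$$

$$\frac{\partial U_\lambda}{\partial y_n} = \frac{\partial u}{\partial y_n} - \frac{\partial c}{\partial y_n} + \lambda\left(\frac{\partial \sum_i p_i y_i}{\partial y_n} - \frac{\partial c}{\partial y_n}\right) = 0.$$

Thus, substituting (for any good $k$) the consumer’s equilibrium condition
(which was just reviewed) $p_k = \partial u/\partial y_k$ and noting that by definition
$\partial c/\partial y_k = mc_k$ and $\partial \sum_i p_i y_i/\partial y_k = mr_k$, these equations become

$$p_1 - mc_1 = -\lambda\,(mr_1 - mc_1)$$

$$\vdots$$

$$p_n - mc_n = -\lambda\,(mr_n - mc_n),$$

which immediately gives us the first Ramsey theorem, with $-\lambda$ here playing
the part of the constant of proportionality in Proposition 6a.¹⁸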


18 The second theorem is now also obtained directly by noting that when cross
elasticities are zero, for example,
$mr_1 = \partial(\sum_i p_i y_i)/\partial y_1 = p_1 + y_1\,\partial p_1/\partial y_1 = p_1 - (p_1/E_1)$,
where $E_1 = -(p_1/y_1)\,\partial y_1/\partial p_1$ = price elasticity of demand. Thus,
substituting such expressions for the $mr_i$ in the last set of equations in the text they become

$$p_i - mc_i = -\lambda\,(p_i - mc_i - p_i/E_i)$$

or, collecting terms,

$$(1 + \lambda)(p_i - mc_i)/p_i = \lambda/E_i,$$

which yields our second Ramsey theorem when we write $k = \lambda/(1 + \lambda)$.
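
The earlier remark that the theorem calls for prices equal to marginal costs whenever such prices happen to cover total costs can be read off the same first-order conditions. The following is a sketch only, assuming that in that case the budget requirement does not bind, so that its multiplier is zero, and assuming $mr_2 \neq mc_2$ where a ratio is taken:

```latex
\begin{align*}
p_i - mc_i &= -\lambda\,(mr_i - mc_i), \qquad i = 1, \dots, n, \\
\lambda = 0 \;&\Longrightarrow\; p_i = mc_i \quad \text{for every } i, \\
\lambda \neq 0 \;&\Longrightarrow\;
  \frac{p_1 - mc_1}{p_2 - mc_2} = \frac{mr_1 - mc_1}{mr_2 - mc_2}.
\end{align*}
```

The last line is simply the ratio form of Proposition 6a, obtained by dividing the condition for good 1 by that for good 2 so that the multiplier cancels.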
11. Beneficial and Detrimental Externalities of Production and Consumption


Much of the discussion of this chapter requires reevaluation when we 
take into account divergences between private and social costs and returns. 
The basic idea behind the argument that the competitive price system is 
optimal is roughly that the businessman (like other members of the econo- 
my) can make money only by producing and marketing useful products and 
services. Hence he benefits only by benefitting the community, and, con- 
versely, by promoting his own interests he necessarily promotes those of 
the rest of the community as well. Unfortunately, there are many cases 
where this crucial premise breaks down—when members of the economy do 
things which benefit others in such a way that they can receive no payment 
in return, or where their actions are detrimental to others and involve no 
commensurate cost to themselves. In such cases of divergence between
social and private returns, self-interest and social interest do not coincide. 
This statement must be interpreted carefully—it is not meant to contrast 
the welfare of individuals with some sort of abstract “social good.” Rather,
it says that when each person independently pursues his own interests he 
may end up less well off than he would under an optimal arrangement. 
Where social and private returns do not coincide, it is possible that all 
members of society will lose out if each of them does his best to promote 
his own aims. 

Let us see, now, how such divergences between private and social returns
are likely to arise and how they can lead to a misallocation of resources.
Specifically, let us examine how they affect the theorems of the preceding 
sections. For this purpose it is convenient to deal, in turn, with four types 
of divergence between private and social returns: beneficial externalities of 
production, detrimental externalities of production, and beneficial and 
detrimental externalities of consumption. It should be noted that most of 
these are, in the last analysis, market imperfections—cases where the market 
offers no price for the provision of a service or a disservice. 

1. Beneficial externalities of production: This category of divergence 
between private and social returns has received a great deal of attention in 
the literature. The concept was first formulated explicitly by the English 
economist, Alfred Marshall, at the end of the last century. 

The case of economies of large-scale production in which a firm can
produce each unit of output more cheaply when it expands can be referred 
to as a case of internal economies—the benefits of the firm’s expansion 
are reaped internally, within the company. By contrast, the external 
economies case is one where an increase in the firm’s production produces 
benefits part (often a substantial part) of which devolve on others. 

This may arise in at least two ways: (a) By expanding its operations 
the firm may perform a direct service to others. The standard example is 
the training of a labor force. If one glass-blowing firm on the island of 
Murano expands its operations, it may have to train more glass-blowers,
who are potentially available for employment by its competitors, and those 
competitors will incur no training costs if they recruit any of those workers. 

(b) A second sort of external economy arises when an expansion in the 
operation of one company makes it cheaper to supply services to all the 
firms in the industry. A rise in the production of Ford automobiles will 
result in an increase in steel production. If there are internal economies in 
steel manufacturing (ordinary economies of large-scale production), steel 
prices may subsequently fall, and so Ford’s competitors also will obtain 
their raw materials more cheaply as a result of the increased output of 
Ford cars. In sum, economies which are external to the firm may arise 
when the expansion of one firm makes it cheaper for all firms in the industry 
to obtain their inputs. 

Both of these types of external benefits of large-scale production 
clearly involve divergence between private and social returns. The firm’s 
expansion makes it cheaper for other companies to operate, but under the 
prevailing price system there is no remuneration to the expanding firm for 
these benefits which it has conferred on others. 

2. Detrimental externalities of production: An expansion of the scale of 
a company’s operations can also have analogous disadvantageous effects 
such as pollution and other forms of damage to the environment as an 
incidental by-product of economic activity. If one company’s increased 
output leads it to keep more trucks in operation, it will crowd the roads 
and, in particular, make it more expensive and time-consuming for other 
companies to ship goods by truck. Increased fishing by one group depletes 
the supply of fish and makes it harder for others to obtain their catch. 
Increased use of water or more drilling of oil wells can make it harder for 
others to get these resources. Increased farming of land which erodes the 
soil very often makes it more difficult for neighbors to produce and main- 
tain the fertility of their territories. There is no need to add still further 
to these illustrations.¹⁹

3. Beneficial and detrimental externalities of consumption: An increase
in consumption can also cause analogous advantages or disadvantages to
others, advantages or disadvantages which are not reflected in the returns
of the person who produces them. As in the production case, these beneficial
and detrimental externalities arise from interdependences which cannot
readily be reflected in the pricing arrangements. A affects B’s welfare not
just by delivering some goods to him and receiving money in return.

19 These are all examples of what are called technological externalities: increased
output by one enterprise requires the use of larger physical inputs by other firms to
produce any given result. A different type of externality which has no such significance
for welfare economics is called a pecuniary externality. This occurs when one firm, by
increasing its output, causes a rise in the price of its inputs. That makes it more
expensive in money terms for other companies who use similar inputs. But it does not
increase the social cost of their production because this production requires no larger
input quantities or expenditure of time and effort than before. Increased use of leather
by a shoe manufacturer may raise leather prices and hence the money costs of other
shoe firms, but it need not make it any harder for them to make shoes.

For example, where the Smiths try to keep up with the Joneses, if 
Jones buys a new cream-colored Cadillac convertible, he makes life harder
for Smith. Smith must now consume more than he did before in order to 
accomplish no more than just maintain his old level of satisfaction. Here 
is a detrimental externality of consumption. On the other hand, if I pur- 
chase more education for my children, I make them better citizens (or, at 
least, so educators would have us believe). This confers an advantage on 
others: it makes it possible for others to achieve a given level of
satisfaction with a smaller expenditure of their own resources. In sum, any increase
in consumption which makes the consumer a more efficient or inefficient 
producer, or any increase in consumption which affects the consumption 
patterns and desires of other buyers, will produce beneficial or detrimental 
externalities. If many women purchase more short skirts, they will make 
those with long hemlines feel increasingly dowdy and less able to resist the 
pressure to buy short skirts themselves; the fact that a book enters the 
best-seller lists may help promote its sales, etc.—in each case, any con- 
sumer’s purchase has had an indirect but very real effect on his fellows. 

Here, then, are some of the most significant classes of cases in which 
social and private returns diverge.²⁰ Some of the illustrations may not seem
to represent cases of great social significance. But one must not conclude 
that externalities play no major role in the economy. Taken together, they 
assume very great significance. The fact that a large industrial firm finds it 
incomparably easier and more profitable to operate in an industrial com- 
munity than in an underdeveloped area is in large part a result of external 
economies. The presence of other firms makes it far easier to run a plant— 
they bring with them a skilled labor force, financial institutions, organiza- 
tions which can efficiently supply technical services and raw materials, and 
so on. Indeed, the scarcity of operating firms and the external economies 
which they provide has often been cited as a major problem of the back- 
ward areas—without such an initial group of firms and the external 
economies of their operations, it is very difficult to get enterprises started. 
Industry is needed to encourage further industrialization. 

On the consumer side, the externalities are also very powerful. The
tastes and the demands of consumers are very heavily conditioned by the
societies in which they live. Food, clothing, housing, and many other
tastes differ from country to country and from region to region, at least
partly because the consumption pattern of one consumer is highly
dependent on those of other members of society. In sum, the externalities of
consumption are by no means negligible.

20 Many other examples can easily be cited. The Piazza San Marco in Venice is the
site of a case of an externally conferred benefit which, by overextension, incurs an
external social cost. The cafes hire bands to serenade their customers. But, being
outdoors, they cannot avoid providing music to the patrons of adjoining bistros.
Unfortunately, however, with four or five of these going at once it becomes impossible to
hear the music anywhere.

It remains for us to see how these externalities affect the allocation of 
resources. The connection is easily suggested. If a firm or an individual 
makes a contribution to social welfare for which he receives no payment, 
he is likely to engage in this activity to a smaller extent than the interests 
of society require. If the production of some commodity confers external 
benefits, private enterprise may easily produce a less-than-optimal amount 
of this item. Company A, in deciding whether to expand its output, will 
not usually be led to do so by the fact that this will make things cheaper 
for companies B and C. 

Similarly, where there are detrimental externalities of production or 
consumption, private enterprise will perhaps overallocate resources (pro- 
duce an excessive amount) because part of the cost of the operation is 
external to the firm—it is borne by others. Notoriously rare is the firm 
which refrains from expanding its operations because it will increase the 
pollution of the atmosphere or because it leads to soil erosion or the 
depletion of a natural resource (such as fish in the sea) which it does not 
own. 

To summarize, 


Proposition 7: Externalities can lead to a misallocation of resources 
even in the world of perfect competition. Too little may be produced by 
industries in which external benefits prevail, while there may be more than 
an optimal output of commodities whose production involves detrimental 
externalities. 
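
A simple numerical case may help fix the idea. The sketch below assumes a hypothetical competitive industry with linear demand p = a - b*q and a constant private marginal cost, to which a constant external damage (or benefit) per unit of output is attached; all figures and names are illustrative assumptions and do not come from the text.

```python
# Assumed linear demand p = a - b*q; competitive firms equate price with their
# private marginal cost, whereas the optimum equates price with social marginal cost.

def output_where_price_equals(cost, a, b):
    """Quantity at which the demand price a - b*q equals the given marginal cost."""
    return (a - cost) / b


a, b = 100.0, 1.0
private_mc = 20.0
external_damage = 10.0    # cost per unit borne by others (detrimental case)
external_benefit = 10.0   # gain per unit accruing to others (beneficial case)

q_competitive = output_where_price_equals(private_mc, a, b)
q_optimal_with_damage = output_where_price_equals(private_mc + external_damage, a, b)
q_optimal_with_benefit = output_where_price_equals(private_mc - external_benefit, a, b)

print("competitive output:             ", q_competitive)
print("optimal output, external damage:", q_optimal_with_damage)
print("optimal output, external benefit:", q_optimal_with_benefit)
```

With these assumed figures competition overshoots the optimum where there is an external damage and falls short of it where there is an external benefit, which is the content of Proposition 7.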


In principle, it is even possible that where there are detrimental ex- 
ternalities the presence of monopolies can lead to outputs smaller, and 
therefore more nearly optimal, than those which would result from 
competition.²¹


21 Moreover, if any industry is taken over by a monopoly, external economies or
diseconomies may sometimes disappear in the process. If the expansion of one firm saves
money for another firm in the same industry, this external economy becomes internal
when the two firms combine into one, since all the benefits accrue to the same management.
This also illustrates the danger of decentralized decision-making in a single company:
the manager of one division will not always pay adequate attention to the effects of his
decisions on other divisions in the company (the effects of his decisions which are
external to his division). As a result, a set of decisions which are optimal for the branches
of the company, taken by themselves, may be far from optimal for the company as a
whole. Cf. Charles Hitch and Roland McKean, “Suboptimization in Operations
Problems,” in Joseph F. McCloskey and Florence N. Trefethen (eds.), Operations Research
for Management, The Johns Hopkins Press, Baltimore, 1954. See also Charles Hitch,
“Economics and Military Operations Research,” Review of Economics and Statistics,
Vol. XL, August 1958.


12. “Market Failure” and Public Goods


We have seen that the market mechanism will fail to produce ideal
results as a result of at least three influences: monopolistic elements,
economies of scale, and externalities. A fourth source of market failure
closely associated with the phenomenon of externalities is also important.
This is the category of public goods, for which we have the


Definition: A pure public good is one which can serve a small or a large
number of persons at exactly the same total cost (the marginal cost of an
additional user is zero). This characteristic is called supply jointness or
undepletability. In addition, public goods are often taken to be
characterized by the impossibility of excluding anyone from enjoying their
benefits once the good has been provided.


Standard examples of public goods are television broadcasts, which
cost the same whether they have 50 or 5 million viewers, and improvement
of the quality of the atmosphere, since reduced smoke emissions from
factories cost the same whether 50 or 5 million nearby residents benefit
from the cleaner air. Note that exclusion is virtually out of the question in
the second case because you cannot stop anyone from breathing, at least
not without committing a felony, while in the broadcasting case exclusion
is technically possible: a scrambling device can prevent someone from
receiving a broadcast without prior payment.

It is often difficult or impossible for private enterprise to supply a
public good profitably, particularly where exclusion is not possible, since
if no one can be prevented from using the good, no one can be forced to pay
for it. Moreover, from the point of view of consumption decisions, the
Pareto optimal price of a pure public good is zero once it has been
produced,²² for since it costs society nothing if another person uses the good,
there is a net opportunity loss in inducing anyone to refrain from
consuming it because of a price charged for the item. This does not mean
that it is desirable socially to produce every public good but merely that
once produced their use should be opened to as many people as want to
take advantage of them even if the benefit they obtain is very small.

To decide whether a public good should be produced, we must add
together the benefits to all the people who will consume it and see whether
they are at least equal to the cost. We can measure individual j’s marginal
benefits from public good X in terms of the money j is willing to give up
for it, that is, in terms of j’s marginal rate of substitution between X and
money (commodity N), $mu_x^j/mu_n^j$. Then the sum of the marginal benefits
to all m individuals in the community will be
$mu_x^1/mu_n^1 + mu_x^2/mu_n^2 + \cdots + mu_x^m/mu_n^m$.

22 That is, it is zero unless there is a budget constraint to be met. Where such a
constraint is present the Ramsey theorem of Section 10 applies.


Proposition 8: At an optimal output of public good X its social marginal
benefit must (for the usual reasons) be equal to its marginal cost, that is,

$$mu_x^1/mu_n^1 + \cdots + mu_x^m/mu_n^m = mc_x.$$


It is instructive to compare this condition for optimal output of a 
public good with the corresponding requirement for a private good given
in Proposition 3. 
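
The summation in Proposition 8 can be illustrated by a small computation. The sketch below assumes three hypothetical individuals whose marginal benefits from the public good, measured in money, decline linearly with the quantity supplied, and a constant marginal cost; these schedules, the figures, and the function names are illustrative assumptions only. It finds by bisection the output at which the sum of the marginal benefits equals marginal cost.

```python
# Assumed marginal benefit of individual j at output x: max(0, a_j - b_j * x).
# The optimum for a public good adds these benefits across individuals.

def total_marginal_benefit(x, intercepts, slopes):
    return sum(max(0.0, a - b * x) for a, b in zip(intercepts, slopes))


def optimal_public_output(intercepts, slopes, marginal_cost, tol=1e-10):
    """Bisect for the output at which summed marginal benefits equal marginal cost."""
    lo = 0.0
    hi = max(a / b for a, b in zip(intercepts, slopes))  # every benefit is zero here
    if total_marginal_benefit(lo, intercepts, slopes) <= marginal_cost:
        return 0.0  # not worth producing even the first unit
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_marginal_benefit(mid, intercepts, slopes) > marginal_cost:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


intercepts = [10.0, 6.0, 4.0]   # assumed willingness to pay for the first unit
slopes = [1.0, 1.0, 1.0]
marginal_cost = 11.0

x_star = optimal_public_output(intercepts, slopes, marginal_cost)
individual_benefits = [max(0.0, a - b * x_star) for a, b in zip(intercepts, slopes)]
print("optimal output:", x_star)
print("marginal benefits at the optimum:", individual_benefits)
```

In this assumed case no individual values even the first unit at as much as its marginal cost, yet the community as a whole gains from a positive output, because the benefits of the same unit are enjoyed by all three at once.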

In any event, it is clear that desirable public goods will often not be 
produced or at least not produced in adequate amounts if their supply is 
left to private enterprise. Typically, public goods from national defense to 
improvement of the environment require either governmental intervention 
or direct supply by the public sector. 


13. Some Additional Results of Welfare Theory 


By way of illustration of the results obtained with the aid of welfare
theory, let us now summarize briefly three standard theorems of welfare
economics. Each of them can be criticized in a number of respects, but each
result, if taken with a grain of salt, yields some useful policy insights. 


Proposition 9: In a world of pure competition a tariff must result in a 
misallocation of resources and in a reduction in net social welfare when all 
affected nations are considered together. 


This result is almost a direct consequence of the results of Section 8. If 
the allocation of resources which results from competitive prices is optimal, 
any reallocation of resources which results from the imposition of a tariff 
and the consequent modification of prices must be presumed to reduce 
welfare in all countries as a group, though the tariff-levying country may 
conceivably gain at the expense of others. Specifically, a tariff reduces 
welfare by causing relative prices to be different for importers and ex- 
porters, thus violating the optimal price system requirements given in 
Section 7. 

Before we go on to the next proposition the reader should be reminded 
that points rationing is a method of distributing limited supplies by 
providing each consumer with some fixed number of ration “points,” say 250 
point tokens, and requiring that he give up 5 points for every pound of 
beef he buys, one point with every pound of sugar, etc. In this way the 
consumer still retains a fair amount of discretion in the choice of his 
purchases even though his total consumption is restricted. By making its 
“point price” (the number of points given up with a purchase) sufficiently 
high, consumption of any item can clearly be cut down as far as is 
appropriate. We can now formulate 


Proposition 10: Wherever it is necessary to restrict the use of a number 
of commodities, it is better to do so by means of a system of point rationing, 
in which each consumer is assigned an equal number of points to be used 
by him as he prefers, rather than the more usual method of assigning an 
equal amount of each good to each consumer. 


Thus, if each consumer is, for example, assigned 1 pound of beef and 
1 pound of lamb per week, in an ordinary rationing procedure, the result 
is necessarily disadvantageous both to consumers who prefer beef and to 
those who prefer lamb. If, on the other hand, each consumer is given 10 
points and told that he must give up 5 points with every pound of meat 
he purchases, beef lovers will get their 2 pounds of beef and lamb eaters 
their 2 pounds of lamb. 

Indeed, if points are really so effective in restricting consumption that 
money income is no longer a real limit of consumption, the ration points, 
rather than the money prices, will become the relevant prices in the pur- 
chases of these rationed commodities. Then each rationed consumer will 
purchase commodities in such proportions that the marginal rate of substi- 
tution of any one commodity for another will be equal to the ratio of their 
fixed point prices. Hence, under points rationing, these marginal rates of 
substitution must be the same for all consumers, as Proposition 1 requires, 
whereas the rule will almost certainly be violated under a more rigid 
rationing system. 
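
The beef and lamb illustration can be sketched in a few lines of Python. The linear utility functions and their taste weights are hypothetical, chosen only so that one consumer prefers beef and the other lamb; the ration sizes and point prices are those of the example above.

```python
# Fixed rations versus points rationing for two consumers with different tastes.
# Utilities are hypothetical linear functions chosen only for illustration.

def utility(weights, bundle):
    """Linear utility: taste weight times pounds consumed, summed over goods."""
    return sum(weights[good] * bundle[good] for good in bundle)


tastes = {
    "beef lover": {"beef": 2.0, "lamb": 1.0},
    "lamb lover": {"beef": 1.0, "lamb": 2.0},
}

# Ordinary rationing: 1 pound of each meat per consumer.
fixed_ration = {"beef": 1.0, "lamb": 1.0}

# Points rationing: 10 points, 5 points per pound of either meat.
points, point_price = 10, 5
pounds_affordable = points / point_price  # 2 pounds in total


def best_points_bundle(weights):
    """With equal point prices and linear tastes, all points go to the preferred meat."""
    favorite = max(weights, key=weights.get)
    return {good: (pounds_affordable if good == favorite else 0.0) for good in weights}


results = {}
for name, weights in tastes.items():
    u_fixed = utility(weights, fixed_ration)
    u_points = utility(weights, best_points_bundle(weights))
    results[name] = (u_fixed, u_points)
    print(name, u_fixed, u_points)
```

Total consumption of each meat is the same under both schemes (2 pounds of beef and 2 of lamb), but under points rationing each consumer ends up with the bundle he prefers.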

Points rationing may be undesirable for one of only two reasons: 
(a) If some consumers are considered irresponsible, say, if they will spend 
points on gin rather than orange juice for their children (or vice versa, 
depending on the point of view), it may then be preferable to issue a fixed 
orange juice (gin) ration which cannot be traded for anything else. (b) If 
money prices are permitted to rise so much that the poor cannot afford to 
use up all their ration points, fixed rations may be desired for morale 
purposes, to keep the wealthy from getting too large a share of the available 
produce in times of scarcity. 

This theorem on rationing illustrates how the relatively innocuous 
Proposition 1, which requires equality of marginal rates of substitution for 
all consumers, can help lead to significant policy conclusions. 


(alleged) Proposition 11: If the government decides to obtain some 
fixed amount of money by means of taxation, it is better to do so by income 
taxation rather than by excise (sales) taxation. 


The central idea of the argument is that a taxpayer who has K dollars 
taken from him by means of an income tax can, if he wishes, always afford 
to purchase the combination of goods which is the most preferred of the 
combinations available to him if he had exactly the same amount taken from 
him by a sales tax on some single commodity. It follows at once that if the 
man pays the income tax he must be at least as well off as he would be after 
the sales-tax payment, for the income-tax payer has available the best of 
the options which would be open to him as a sales-tax payer (as well as 
some options which would then not be available). 

Consider a consumer who has an income of $100 and buys 3 units of 
commodity X at $10 each (Table 1). If, say, a tax of $1 on each unit of X 
sold leads the buyer to reduce his purchase of X down to U units (his 
preferred purchase of X at this new price), a $1 excise tax on X will clearly 
yield the government U dollars in sales-tax money from this person 
(second line of the table).²³ Since the consumer now spends $11 (including 
$1 tax) on each of U units of X, he will be left with 100 - 11U dollars to 
spend on his other purchases. 


TABLE 1 

                                     Income      Price    Units of    Income Left to Spend 
                          Income    After Tax    of X     X Bought    on Items Other than X 

Before tax                 100                    10         3         70 
Excise tax                 100                    11         U         100 - 11U 
Income tax (possible)      100       100 - U      10         U         100 - U - 10U 
Income tax (actual)        100       100 - U      10         ?         ? 


Suppose, on the other hand, that the government were simply to collect 
an equal amount, U dollars, out of the income of the consumer instead of 
setting up a sales tax. The consumer could, if he wishes (third line of the 
table), still buy U units of X (now at $10 each) and be left with 100 minus 
the U dollars tax minus his 10U dollars outlay on X, again leaving him 
100 - 11U dollars in total to spend on other commodities. Thus, this 
hypothetical allocation of his money after payment of an income tax 
would, if he chose it, leave him exactly as well off as the sales tax, because 
in either case he ends up with exactly U units of X and 100 - 11U dollars 
to spend on other items. But, in fact, the income-tax-paying consumer may 
not wish to do exactly what he would have done in the presence of a sales 
tax. He will, very likely, spend the 100 - U he has left after income taxes 
in some way other than buying U units of X (fourth line of the table). If 
so, he presumably adopts this alternative because he prefers the situation 
in line 4 to that in line 3. But since line 3 is identical with line 2, the best 
of the alternatives available to him under a sales tax on commodity X, he 
must prefer his state after payment of income tax (line 4) to his situation after 
payment of the sales tax. Any other numbers or algebraic symbols can be 
substituted in the table and the reader can see they lead to the same 
results: the income tax generally hurts the consumer less than does the 
excise tax. 


23 More generally, if with a sales tax the person would buy x units of X at a price 
of p dollars and the government wishes to collect K dollars, it must levy a tax K/x 
dollars on each unit of X sold, so the consumer ends up with 100 - px - K dollars. 

In intuitive terms, what is the logic of this result? It is simply that an 
excise tax distorts prices from their optimal levels and forces the consumer 
to reallocate his expenditures among commodities in a less desirable manner. 
An income tax reduces the consumer’s over-all purchasing power but does 
not directly change relative prices and so does not force him to redirect 
his expenditures. 
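
The argument of Table 1 can be checked numerically under an assumed utility function. The sketch below uses Cobb-Douglas tastes, which are an assumption of this illustration and not part of the original argument; the parameter is chosen so that the untaxed consumer buys 3 units of X at $10 out of a $100 income, as in the first line of the table.

```python
# Excise tax versus an income tax of equal yield for one consumer.
# Assumption: Cobb-Douglas utility u = x**A * y**(1 - A), where x is units of X
# and y is dollars spent on everything else. With A = 0.3 the consumer buys
# 3 units of X at $10 out of a $100 income, matching line 1 of Table 1.

A = 0.3          # assumed taste parameter
INCOME = 100.0
PRICE = 10.0
UNIT_TAX = 1.0


def demand(income, price):
    """Utility-maximizing bundle (x, y) for Cobb-Douglas tastes."""
    x = A * income / price
    y = income - price * x
    return x, y


def utility(x, y):
    return x ** A * y ** (1 - A)


# Line 2 of the table: the excise tax raises the price of X to $11.
x_excise, y_excise = demand(INCOME, PRICE + UNIT_TAX)
revenue = UNIT_TAX * x_excise            # U dollars collected

# Line 3: income tax of the same U dollars, consumer repeats the excise-tax bundle.
y_possible = INCOME - revenue - PRICE * x_excise   # equals 100 - 11U

# Line 4: income tax of U dollars, consumer re-optimizes at the untaxed price.
x_income, y_income = demand(INCOME - revenue, PRICE)

print(x_excise, y_excise, utility(x_excise, y_excise))
print(x_income, y_income, utility(x_income, y_income))
```

Line 3 reproduces the excise-tax bundle exactly, and the re-optimized bundle of line 4 yields a slightly higher utility, which is all the argument claims. The sketch holds pre-tax income fixed, so it is open to the same objections as the table itself.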

The validity of the argument has been questioned on a number of 
grounds. First of all, an income tax also motivates the individual to change 
his pattern of behavior. It distorts his income-earning plans rather than 
his consumption pattern. It can lead him to work either harder or less 
hard than he would have if no tax had been imposed on his earnings. 
This does not show up in Table 1, which assumes that the consumer’s 
total income before taxes is fixed at $100. It is true, of course, that an excise 
tax also reduces the consumer's real income so that we may still argue that 
the sales tax distorts consumer behavior in two ways, whereas an income 
tax does so in only one. But it is easy to show the important 


Proposition 12. The theorem of second best²⁴: It is not necessarily worse 
for society if a large number of optimality conditions are violated than if 
only a few are violated. 


We have seen, for example, that the presence of only one monopoly 
firm which violates the optimal-pricing conditions may be worse than 
having many firms that do so (Section 8) and that the monopolist’s output 
restriction which is a violation of the requirements of Pareto optimality 
may help to offset the violation of the optimality conditions resulting from 
detrimental externalities (Section 11). The second-best theorem is a 
somewhat unfortunate result from the point of view of policy applications of 
welfare theory for it tells us that piecemeal elimination of violations of the 
optimality conditions is not necessarily beneficial. 


24 See Richard Lipsey and Kelvin Lancaster, “The General Theory of Second Best,” 
Review of Economic Studies, Vol. 24, December 1956. 
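
A small numerical sketch may make the second-best point concrete. It assumes a linear demand curve, a constant private marginal cost, and a constant external damage per unit of output; every figure is hypothetical. Welfare is measured as total benefit to consumers less private cost and external damage.

```python
# Second best: with a detrimental externality already present, adding a
# monopoly's output restriction (a second violation) can raise welfare.
# Linear demand, constant costs; all figures are hypothetical.

DEMAND_INTERCEPT = 100.0   # price = 100 - Q
PRIVATE_MC = 20.0          # marginal cost borne by the producer
EXTERNAL_DAMAGE = 30.0     # uncompensated damage per unit of output


def welfare(q):
    """Consumers' total benefit minus private cost minus external damage."""
    total_benefit = DEMAND_INTERCEPT * q - q ** 2 / 2
    return total_benefit - (PRIVATE_MC + EXTERNAL_DAMAGE) * q


q_optimal = DEMAND_INTERCEPT - (PRIVATE_MC + EXTERNAL_DAMAGE)   # price = social marginal cost
q_competitive = DEMAND_INTERCEPT - PRIVATE_MC                   # price = private marginal cost
q_monopoly = (DEMAND_INTERCEPT - PRIVATE_MC) / 2                # marginal revenue = private marginal cost

for label, q in [("optimum", q_optimal),
                 ("competition with externality", q_competitive),
                 ("monopoly with externality", q_monopoly)]:
    print(label, q, welfare(q))
```

With these numbers the competitive industry, which violates only the externality condition, produces too much and yields lower welfare than the monopolist, who violates two conditions at once but whose restriction of output partly offsets the overproduction.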


14. Criteria for Welfare Judgments 


As mentioned earlier, a decade ago there was much discussion of the 
circumstances under which the economist is entitled to make any welfare 
pronouncements—when he can say that policy A will, in some sense, in- 
crease the welfare of the community as a whole. This problem lies at the 
foundations of welfare economics, for unless the economist knows how to 
distinguish between a policy change which is an improvement and one 
which makes things worse, he is in no position to make any recommenda- 
tions at all. 

It is my opinion, however, that the protracted discussion of this issue 
was neither very necessary nor very illuminating. I believe that there is a 
wide variety of policy recommendations which economists have long made 
and can continue to make with clear consciences. On many issues the de- 
sires of the community are rather obvious. For example, during the Great 
Depression it required little justification for the profession to adopt the 
reduction of unemployment as a prime objective. Similarly, in an impover- 
ished country, an increase in per capita income can surely be assigned a 
high priority. Even where the situation is not so clear-cut, the economist 
has enough to say on policy matters by sticking to questions of the means 
appropriate for the achievement of given ends. For example, in a country 
in which the government is seeking to build up gold and dollar reserves, 
the economist is clearly the person who must consider whether a devalua- 
tion will make things better or worse in this respect. 

In any event, I believe that the discussion of welfare criteria was 
relatively sterile and was largely foredoomed to failure. Any attempt to 
construct a rigorous and universally applicable criterion for distinguishing 
what policy change is an economic improvement must founder on the 
problem of interpersonal comparisons. Where a policy change affects some 
persons favorably and others adversely, as is usually the case, there is no 
a priori way of weighing the net result. Of course, we can and must make 
interpersonal comparisons—we judge, reasonably, that flood victims must 
occupy the government’s attention and receive emergency assistance at the 
cost of other taxpayers. We decide that the building of a hospital will serve 
the general welfare even if it is inconvenient for a few homeowners who 
were located at its site, and so on. However, these judgments must be 
rough and ready and can only be handled case by case. No abstract and 
general formula can be invented which handles all such problems 
satisfactorily. 

It is nevertheless worth examining several of the general criteria which 
were proposed for this purpose—criteria designed to test whether or not a 
proposed policy change is an improvement. 

1. The Pareto criterion. We have already discussed this first criterion, 
originally formulated by the Italian Vilfredo Pareto about half a century 
ago, which states simply that 


Any change which harms no one and which makes some people better 
off (in their own estimation) must be considered to be an improvement. 


This statement is certainly persuasive, and it is less empty than may 
at first appear. Some rather striking analytical results can be obtained 
with its help. For example, the argument of Section 13, that points ration- 
ing is ordinarily better than fixed ration quantities, relies on no more than 
the Pareto criterion. Points rationing may permit every consumer to benefit 
by adjusting his purchases in accord with his own tastes and desires, and no 
one need be harmed by it. For a similar reason, Proposition 1 of Section 4 
can be considered to be founded upon the Pareto criterion. 

Unfortunately, there are many policy proposals which cannot be judged 
with the aid of this criterion. The Pareto criterion does not apply to any 
proposal which will benefit some and harm others. In other words, the 
Pareto criterion works by sidestepping the crucial issue of interpersonal 
comparison and income distribution, that is, by dealing only with cases 
where no one is harmed so that the problem does not arise. 

To compare it with the other criteria, it is convenient to translate the 
Pareto criterion into graphic terms. For simplicity, let us deal with a 
community in which there are only two persons, X and Y. In Figure 4a, 
let us represent the utility of individual X along the horizontal axis and 
that of Y along the vertical axis.²⁵ 

The Pareto criterion then states that if we start off from a situation 
which is represented by a point like A, then a policy change is an improve- 
ment if it results in a move to any point like B, C, or D which lies to the 
right of A, or above A, or above and to the right of A, for at B, X is better 
off than at A with Y as well off as before, whereas the move to C benefits 
Y without harming X, and the move to D benefits both persons. 


25 For this purpose it does not matter how we measure this utility. The utility scales 
of the two individuals need not be comparable. All that is required for our purposes is 
that a movement toward the right in the graph, say from point A to B, always corre- 
sponds to some (unspecified) increase in X’s welfare and that an upward movement 
represents an improvement in Y's well-being. 


528 General Equilibrium and Welfare Economics Chapter 21 


Figure 4, panels (a) and (b). Horizontal axis: X's utility; vertical axis: Y's utility. 


However, a move from A to E cannot be evaluated on the basis of the 
Pareto criterion, for this change increases Y's welfare but it does so at X's 
expense. 
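
The Pareto test for a two-person community reduces to a comparison of coordinates, as the following sketch shows. The coordinates assigned to points A through E are hypothetical; only their positions relative to A matter.

```python
# Pareto criterion for a two-person community.
# Points are (utility of X, utility of Y); the coordinates are hypothetical
# and only the ordering along each axis matters.

def pareto_verdict(start, end):
    """Classify a move as 'improvement', 'worsening', 'no change', or 'not comparable'."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    if dx == 0 and dy == 0:
        return "no change"
    if dx >= 0 and dy >= 0:
        return "improvement"
    if dx <= 0 and dy <= 0:
        return "worsening"
    return "not comparable"


A = (4, 4)
moves = {"B": (6, 4), "C": (4, 6), "D": (6, 6), "E": (2, 7)}
for name, point in moves.items():
    print(f"A to {name}: {pareto_verdict(A, point)}")
```

The moves to B, C, and D are classified as improvements, while the move to E, which helps Y at X's expense, is left unranked.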

2. The Kaldor criterion. In order to permit the economist to pass 
judgment on a move such as that from A to E, Kaldor proposed the following 
criterion.²⁶ Suppose we ask individual Y how much he would pay (the 
maximum amount) rather than forego the move from A to E, and call this 
amount $K_y$. Similarly ask X how much he is willing to pay to prevent this 
change (call the amount $K_x$ dollars). Then if $K_y$ exceeds $K_x$, Kaldor argued 
that Y could compensate X for his loss in welfare and yet keep some part 
of the gain for himself. In other words, the change is a net gain, on balance, 
according to Kaldor, because, at least in money terms, the gain to Y 
outweighs the loss to X. Note that Kaldor does not require that X actually 
be compensated so that no one would end up with a loss. Such a change 
with compensation would be an improvement even under the Pareto 
criterion. Kaldor merely requires that the gainer be able, potentially, to 
make this compensation out of his gains. The Kaldor criterion then states 
that 


A change is an improvement if those who gain evaluate their gains at 
a higher figure than the value which the losers set upon their losses. 


26 A very similar criterion was formulated by Hicks, and, later, Little drew on these 
to put forward a more guarded criterion of his own. 
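
In money terms the Kaldor test is a single comparison. The sketch below states it as code; the function names and the dollar figures are hypothetical.

```python
# Kaldor's compensation test in money terms.
# k_gainers: the most the gainers would pay rather than forgo the change.
# k_losers:  the most the losers would pay to prevent it.
# The dollar figures are hypothetical.

def kaldor_improvement(k_gainers, k_losers):
    """True when the gainers could compensate the losers and still come out ahead."""
    return k_gainers > k_losers


def potential_surplus(k_gainers, k_losers):
    """What the gainers would keep after a purely hypothetical full compensation."""
    return k_gainers - k_losers


print(kaldor_improvement(200, 70), potential_surplus(200, 70))
print(kaldor_improvement(50, 70), potential_surplus(50, 70))
```

Nothing in the test checks whether compensation is actually paid; it compares only the two money valuations.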


To translate this into graphic terms we must introduce a device called 
the utility possibility curve (PP’ in Figure 4b). Suppose we start off at 
point F and consider what happens if X gives up some of his wealth and 
presents it to Y. This might result in the move to point G where X is worse 
off and Y better off than at F. Still another such redistribution of wealth 
might move us to E and so on. Thus, PP' is the locus of all combinations 
of X's and Y's utility levels which can be achieved by a redistribution of 
wealth between individuals X and Y and where this redistribution is 
accompanied by no other change. 

Consider now the change from point A to E, which, we have seen, 
cannot be evaluated by means of the Pareto criterion because it involves a 
gain for Y but a loss for X. PP' is the utility possibility curve through 
point E. But there are points such as F and G which can be attained from E 
by a redistribution of wealth and which lie above and/or to the right of A. 
On the Kaldor criterion, then, the move from A to E is in this case an 
improvement because it is possible to redistribute wealth at E in such a way 
that no one loses as a result of the change. At G and certainly at F, X has 
been compensated for his loss. We conclude that, on the Kaldor criterion, 
any move from a point A to a point E is an improvement if and only if A 
lies underneath the utility possibility curve through point E. 

3. The Scitovsky double criterion. Scitovsky soon pointed out that the 
Kaldor criterion suffers from a serious weakness. Suppose there is an 
economic change that not only affects the utility of each individual (the 
change from A to H in Figure 4c) but that simultaneously shifts the utility 
possibility locus. The issue is whether the entire change is a good thing. 
It is possible, on Kaldor's criterion, that a move from A to H will be 
considered an improvement but that, at the same time, the return from H 
back to A will be an improvement as well! This is shown in Figure 4c, 
where A lies below the utility possibility curve RR’ through H, but at the 
same time H lies below SS’, the utility possibility curve through A! It will 
be observed that this odd situation occurs as a result of the intersection of 
the two utility possibility curves in Figure 4c. 

To avoid this embarrassing possibility, Scitovsky proposed a stricter 
test involving two parts: 


(a) Use the Kaldor criterion to see if the move from the initial 
point to the new point is an improvement. 

(b) Use the Kaldor criterion to make sure that the return move 
from the new point back to the initial point is not an improvement. 
On this criterion, if and only if the move passes both parts of the 
double test is the move an improvement, according to Scitovsky. 
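
The reversal that troubles the Kaldor criterion, and the way the double test rules it out, can be sketched with two crossing utility possibility curves. The linear curves and the coordinates of A and H below are hypothetical stand-ins for the curves SS' and RR' of Figure 4c.

```python
# Kaldor and Scitovsky tests with utility possibility curves.
# Each situation is a point (X's utility, Y's utility) plus the curve through it,
# given as a function from X's utility to the highest attainable utility for Y.
# The two linear curves below are hypothetical and are chosen to cross.

def lies_beneath(point, curve):
    """True when the point is strictly below the curve at the point's X-utility."""
    x, y = point
    return y < curve(x)


def kaldor(origin, curve_through_destination):
    """Kaldor: a move is an improvement iff the origin lies beneath the destination's curve."""
    return lies_beneath(origin, curve_through_destination)


def scitovsky(origin, origin_curve, destination, destination_curve):
    """Scitovsky: pass Kaldor on the move and fail Kaldor on the return move."""
    return kaldor(origin, destination_curve) and not kaldor(destination, origin_curve)


def curve_ss(x):   # utility possibility curve through A
    return 6 - 0.5 * x


def curve_rr(x):   # utility possibility curve through H
    return 10 - 2 * x


A = (2, 5)         # on SS': 6 - 0.5 * 2 = 5
H = (4, 2)         # on RR': 10 - 2 * 4 = 2

print(kaldor(A, curve_rr))                  # the move from A to H
print(kaldor(H, curve_ss))                  # the return move from H to A
print(scitovsky(A, curve_ss, H, curve_rr))  # the double test
```

Both the move and its reversal pass the Kaldor test here, so the double test rejects the move.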


In my view, both the Kaldor and the Scitovsky tests operate on the 
basis of an implicit and unacceptable value judgment. By using a criterion 
involving potential money compensation, they set up a concealed inter- 
personal comparison on a money basis. If Y's gain is worth $200 to him 
whereas X evaluates his loss at $70, we are not entitled to jump to the 
conclusion that there is a net gain in the move from A to E in Figure 4a. If 
X is a poor man or a miser, $70 may mean a great deal to him, whereas if Y 
is a rich man or a profligate, the $200 may represent a trifle hardly worth 
his notice. Thus, unless X is actually compensated for his loss (in which 
case the Kaldor criterion is unnecessary—and the Pareto criterion can do 
the job) the change from A to E may represent a major loss to X and a 
trivial gain to Y even if it passes the Kaldor criterion with flying colors. 

The Kaldor and Scitovsky criteria have thus ducked the basic problem 
—the interpersonal comparison required to evaluate a policy change which 
harms X but aids Y. They duck it by saying, implicitly, that the economist’s 
recommendation should be based on X’s and Y’s relative willingness and 
ability to pay for what they want. They accept the status quo distribution 
as a measure of the relative strength of feeling of the two individuals. 

It is no answer to this criticism to say that these criteria are just 
designed to measure whether production, and hence potential welfare, are 
increased by a policy change—that these criteria disentangle the evaluation 
of a production change from that of the distribution change by which it is 
accompanied. Consider a change in production which increases gin output 
but reduces the output of whiskey. If X likes highballs but Y prefers 
martinis, the question whether this is an increase in production is inex- 
tricably tied in with the distribution of these beverages between X and Y. 

Even if the utility possibility curves never intersect (Figure 4d), the 
same problem can arise. At point J, X is better off but Y is worse off than 
at A. Thus, even though the Kaldor and the Scitovsky criteria both tell us 
that J is better than A because J’s utility possibility curve lies above A 
(but not vice versa), it is not at all clear that we are entitled to this 
conclusion. 

4. The Bergson criterion. A final criterion to be described here is due to 
Bergson. He suggests, reasonably, that the only way out of the problem is 
the formulation of a set of explicit value judgments which enable the 
analyst to evaluate the situation. These judgments as to what constitutes 
justice and virtue in distribution may be those of the economist himself, or 
those set up by the legislature, by some other governmental authority, or 
by some other unspecified person or group. 

In effect, this amounts to the construction of an indifference map 
ranking different combinations of the utility which may accrue to the 
various members of society (the broken lines in Figure 4a). Such an 
indifference map is called the social welfare function, and it does permit the 
analyst to judge definitively whether or not a proposed policy change is an 
improvement. Thus, in Figure 4a, E must be considered better than A 
(the change from A to E is an improvement) because E lies on a higher 
indifference curve of that social welfare function. 
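
To illustrate how an explicit social welfare function settles a case the Pareto criterion leaves open, the sketch below ranks two utility combinations under two different sets of weights. The functional form, the weights, and the coordinates of A and E are all hypothetical; they stand in for whatever value judgments the analyst or the authorities supply.

```python
# A Bergson social welfare function as an explicit value judgment.
# W ranks utility combinations (u_x, u_y). The functional form and the
# weights are hypothetical choices, not anything prescribed by the criterion.

def social_welfare(u_x, u_y, weight_on_x=0.5):
    """Weighted geometric mean; its indifference curves are convex to the origin."""
    return u_x ** weight_on_x * u_y ** (1 - weight_on_x)


A = (4.0, 4.0)
E = (3.0, 7.0)   # Y gains and X loses, so the Pareto criterion is silent

# Equal weights on the two persons.
print(social_welfare(*A), social_welfare(*E))

# Value judgments that weigh X's position much more heavily.
print(social_welfare(*A, weight_on_x=0.8), social_welfare(*E, weight_on_x=0.8))
```

With equal weights E ranks above A; with the heavier weight on X the ranking is reversed. The verdict depends entirely on the value judgments fed in.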

Essentially, the Bergson criterion must be judged right, if not very 
helpful. To decide whether E is better than A, we must certainly employ 
some value judgments, and unless these judgments are explicit they must 
be treated with suspicion. Implicit value judgments only too often are at 
variance even with the intentions of those who make them, as would seem 
to be the case with the Kaldor and Scitovsky criteria. 

But the Bergson criterion, though it provides us with a highly useful 
frame of reference, unfortunately does not come equipped with a kit and 
a set of instructions for collecting the welfare judgments which it requires. 
Thus, it still leaves us with the difficult part of the job unsolved. At any 
rate, it is not advisable to approach the problem as one noted economist is 
supposed to have done—by confronting the chief executive of a large 
underdeveloped country and saying to him, “Please describe your social 
welfare function to me.” 


15. A Theorem on Democratic Group Decisions 


Several economists have recently devoted considerable attention to the 
relationship between individual and group decisions. That is, given 
information about the desires of the various persons who make up the 
group, the problem is that of setting up reasonable procedures for the 
reconciliation of those desires into a group decision. It will be noted that 
this problem has a family resemblance to that of the preceding section. 

The discussion of this subject stems largely from the work of Kenneth 
J. Arrow.²⁷ His procedure is to list some plausible acceptability criteria 
for social decisions and to examine their implications. He originally 
proposed the following four minimal conditions which social choices must meet 
in order to reflect individuals’ preferences: (1) Social choices must be 
consistent (transitive) in the sense that if A will be decided in preference 
to B, and B in preference to C, then C will not be decided in preference to 
A; (2) the group decisions must not be dictated by anyone outside the 
community or by any one individual in the community; (3) social choices 
must not change in the opposite direction from the choices of the members 
of that society; that is, an alternative which would otherwise have been 
chosen by society must never be rejected just because some individuals 
come to regard it more favorably; and (4) a social decision as between two 
alternatives must not change so long as no individual in the community 
changes the order in which he ranks these alternatives in accord with his 
preferences. In other words, the social preference as between two 
alternatives, A and B, must depend only on people’s opinions of just these two 
alternatives, A and B (and not on any other alternative which does not 
happen to be immediately relevant). 


27 See, particularly, his Social Choice and Individual Values, Cowles Commission 
Monograph No. 12, John Wiley & Sons, Inc., New York, 1951. See also Duncan Black 
and R. A. Newing, Committee Decisions with Complementary Valuation, Hodge, London, 
1951, and the references to Black's work in Arrow, op. cit. 

At first glance, these requirements for social choice may seem a rather 
appropriate set of conditions for democratic decision-making. However, 
Arrow has shown that the matter is not so simple. He has demonstrated 
that it is impossible to choose among all possible sets of alternatives without 
violating at least one of his four criteria. In other words, it would appear 
that social choice must be in a sense inconsistent or undemocratic! This 
negative result is the central theorem of Arrow's book. 

Let us illustrate how such difficulties can arise. The obvious and most 
standard procedure for reaching group decisions is the ballot. But it has 
long been known that the voting procedure runs afoul of Arrow's first 
requirement. That is, majority rule can lead to a pattern of social choices 
which is not transitive even though every voter has transitive preferences. 
This can be illustrated by an example. Three individuals, Smith, Jones, 
and Mznch, are to vote among three alternatives, A, B, and C, by writing a 
“3” next to the alternative they like most, a “2” beside the one they rank 
next highest, etc. Suppose, then, we get the following record of this 
balloting: 


                 A    B    C 
Smith            3    2    1 
Jones            1    3    2 
Mznch            2    1    3 


A glance at the table shows that both Smith and Mznch prefer A to B, 
that Smith and Jones both prefer B to C, and that Jones and Mznch prefer 
C to A. Hence, the majority prefers A to B and B to C, but it also prefers 
C to A! We see then that majority voting can easily lead to intransitive 
social choice patterns. 
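
The cycle can be verified mechanically. In the sketch below the ballots are those implied by the preferences just described (Smith ranks A over B over C, Jones B over C over A, Mznch C over A over B); the function names are illustrative.

```python
# Pairwise majority voting over the three ballots described in the text.
# A higher number means a more preferred alternative (3 = first choice).
from itertools import permutations

ballots = {
    "Smith": {"A": 3, "B": 2, "C": 1},
    "Jones": {"A": 1, "B": 3, "C": 2},
    "Mznch": {"A": 2, "B": 1, "C": 3},
}


def majority_prefers(first, second):
    """True when more voters rank `first` above `second` than the reverse."""
    votes_for = sum(1 for b in ballots.values() if b[first] > b[second])
    return votes_for > len(ballots) - votes_for


def intransitive_triples():
    """Orderings (p, q, r) where the majority prefers p to q and q to r, yet r to p."""
    return [(p, q, r) for p, q, r in permutations("ABC", 3)
            if majority_prefers(p, q) and majority_prefers(q, r) and majority_prefers(r, p)]


print(majority_prefers("A", "B"), majority_prefers("B", "C"), majority_prefers("C", "A"))
print(intransitive_triples())
```

Each individual ballot is a transitive ranking, yet the list of intransitive triples is not empty.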

Later examination of the problem has suggested, however, that Arrow's 
requirements are more strict than they seem at first view and that 
inconsistent or “undemocratic” social choice-making is not really the only 
alternative.²⁸ The difficulty pointed out by Arrow’s research can be 
ascribed, in part, to the fact that the fourth condition, above, is 
considerably more restrictive than first appears and is not merely the postulate of 
popular sovereignty that it seems. First, it implies that in deciding as 
between two alternatives the public's preferences as among still other 
alternatives be treated as irrelevant.²⁹ Suppose, for example, that half the 
public prefers the erection of a bridge to the digging of a tunnel under a 
river at comparable cost, while the other half ranks the projects the other 
way. The fourth condition requires that the government’s decision be 
uninfluenced by the fact that the tunnel advocates feel this to be the most 
important of the public works projects currently under discussion, while 
those who want the bridge really think that almost any other project is of 
greater significance. 


28 See Clifford Hildreth, “Alternative Conditions for Social Orderings,” Econometrica, 
Vol. XXI, January 1953; and Leo A. Goodman and Harry Markowitz, “Social Welfare 
Functions Based on Individual Rankings,” American Journal of Sociology, Vol. LVIII, 
November 1952. For a more technical criticism of Arrow's argument, see Julian H. 
Blau, “The Existence of Social Welfare Functions,” Econometrica, Vol. 25, April 1957. 

29 As an example of the intent of this assumption of “independence of irrelevant 
alternatives” consider the balloting described in the following two tables which violate 
the premise: 


                       A    B    C    D                              A    C    D 
Total point vote      10    7    8    5       Total point vote       7    7    4 

[Individual ballots of Smith, Jones, and Mznch not recoverable.] 


In the left-hand table A wins by 10 points to 8 points for C. But if only irrelevant 
alternative B is dropped from consideration (the right-hand table), A and C become tied. 
In effect, the assumption states that the decision of a third party to put up a candidate 
who stands no chance of winning himself should not affect the outcome of the election 
as between the Democratic and Republican candidates. 

However, it may be questioned whether it is really socially desirable to exclude such 
“irrelevant alternatives” from consideration. A weaker third party, such as the Liberal 
Party in Great Britain, derives much of whatever power it possesses from the possibility 
that its decision to run candidates may affect the outcome of an election. Elimination 
of this sort of influence may materially weaken the protection which the political system 
affords to the “irrelevant” minority groups. One may suspect that the popularity among 
mathematical economists of the axiom of “independence of irrelevant alternatives” 
stems as much from its spectacular consequences as from its attractiveness as a political 
tenet. 
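
The point-vote illustration of “independence of irrelevant alternatives” given in the accompanying note can be sketched as follows. The individual rankings here are hypothetical: they were constructed only so as to reproduce the totals quoted in the note (10, 7, 8, 5 with B on the ballot; 7, 7, 4 once B is dropped) and should not be read as the author's own figures.

```python
# Point voting and the "independence of irrelevant alternatives".
# Each voter ranks the alternatives; with n alternatives on the ballot the top
# choice earns n points, the next n - 1, and so on. The rankings are
# hypothetical, built to give totals of 10, 7, 8, 5 with B present and
# 7, 7, 4 without it.

rankings = {
    "Smith": ["A", "B", "C", "D"],   # most preferred first
    "Jones": ["A", "B", "C", "D"],
    "Mznch": ["C", "D", "A", "B"],
}


def point_totals(rankings, excluded=()):
    """Total points per alternative after dropping any excluded alternatives."""
    totals = {}
    for order in rankings.values():
        remaining = [alt for alt in order if alt not in excluded]
        for position, alt in enumerate(remaining):
            totals[alt] = totals.get(alt, 0) + len(remaining) - position
    return totals


print(point_totals(rankings))                   # B on the ballot: A beats C
print(point_totals(rankings, excluded=("B",)))  # B withdrawn: A and C tie
```

No voter changes his ranking of A relative to C, yet the social ranking of the two changes when B is withdrawn, which is exactly what Arrow's fourth condition forbids.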

In addition, Arrow’s fourth condition requires that only rankings be 
considered. This means that no weight be given to the intensity of desires. 
For example, if 50 per cent of the public demands the tunnel with consider- 
able emotion because it feels that the bridge will deface the beauty of the 
area, while the other half of the public has a slight preference for the bridge 
because of its slightly lower cost, the difference in intensity of these 
preferences must, on this fourth condition, be disregarded. 

Of course, we do not know how to measure intensity of feeling. Still 
there are cases where there would be consensus on this and where, in fact, 
a choice in accord with the Arrow condition would be unacceptable in 
principle to most of us. For example, in deciding whether to allocate labor 
to the production of some drug needed to treat a rare but dangerous disease 
or to the manufacture of Scrabble sets, we may recognize that, compassion 
aside, more people will want the Scrabble sets than the medicine. Yet on 
the crudest sort of interpersonal comparison of benefits, we may decide 
that the public as a whole will gain more from the production of the 
medicine because its potential users feel much more strongly about their 
preference than do the others. 

However we may feel about the outcome of this discussion, it must be 
agreed that Arrow has again called our attention to the presence of pitfalls 
and treacherous problems in the analysis of group decision-making. More- 
over, although the reader is given no hint of their flavor here, Arrow has 
made a very important contribution in his choice of mathematical tools, 
for he has shown that the subject matter lends itself well to the methods 
of symbolic logic, and by means of this demonstration he has made what 
may well prove to be a significant addition to the economist’s stock of 
useful analytic equipment. 


16. Concluding Remarks 


In this chapter we have seen that welfare economics has run the gamut 
from specific (though abstractly derived) policy conclusions on particular 
issues like rationing to broad, rather philosophical investigations into the 
proper foundations for the entire area of investigation. More recently, 
welfare economics seems to have gone off into a relatively new direction. 
It has been used in operations-research type of analysis of specific problems 
of government and as training material for operations researchers who 
can learn from the special concepts of welfare economics to avoid some 
frequently encountered analytic booby traps. For example, the idea of 
external economies and diseconomies has taught us to beware of policies 
which yield optimal results for each of the various divisions of a firm 
taken by themselves, because by not taking into account the effects of its 
decisions on the rest of the company, policy-making, division by division, 
may yield results which are far from optimal for the company as a whole. 
We see, then, that the emphasis in welfare economics has swung from its 
rather abstract subject matter in the 1940s toward the other extreme— 
to very applied work and concrete problems of day-to-day economic 
decision-making. 
Input-Output Analysis 
22 


1. The Economic Problem and the Assumptions 


Input-output analysis, for which we are indebted to Professor Leontief, 
is the name given to the attempt to take account of general equilibrium 
phenomena in the empirical analysis of production. The three italicized 
elements in this statement are crucial and merit further discussion. Reversing 
their order, we observe, first, that the analysis deals almost exclusively 
with production. Demand theory plays no role in the hard core of 
input-output analysis.¹ The problem is essentially technological. The investigation 
seeks to determine what can be produced, and the quantity of 
each intermediate product which must be used up in the production process, 
given the quantities of available resources and the state of technology. 

The second distinctive feature of input-output analysis is its devotion to 
empirical investigation. This is primarily what distinguishes it from the 
work of Walras and later general equilibrium theorists. A consequence of 
this no doubt long-overdue concern with the facts is that compromises have 
been forced on the investigator. Input-output employs a model which is 


1 This is strictly true only of the open model which is described here. In this model 
the final demand sector is, in effect, taken to be outside the production economy and 
final products are "exported" to the consumer inhabitants of this "foreign" demand 
sector. There is, however, a closed model in which labor is treated as a produced com- 
modity and consumption as the raw materials used up in the production of labor. Here 
at least some rudimentary demand analysis must enter to show how the levels of con- 
sumption demands are related to the levels of labor outputs supplied. 
more severely simplified and also more narrow in the sense that it seeks to 
encompass fewer phenomena than does the usual general equilibrium 
theory. Its narrowness lies in its exclusive emphasis of the production side 
of the economy. Its oversimplifications I shall discuss presently. 

The third distinctive feature is its emphasis of general equilibrium² 
phenomena. Input-output seeks to take account of the interdependence of 
the production plans and activities of the many industries which constitute 
an economy. This interdependence arises out of the fact that each industry 
employs the outputs of other industries as its raw materials. Its output, in 
turn, is often used by other producers as a productive factor, sometimes by 
those very industries from which it obtained its ingredients. Steel is used to 
make railroad cars and railroad cars are, in turn, used to transport steel and 
the coal and pig iron which are used in its manufacture. Other examples 
should come to mind at once. 

The basic problem, then, is to see what can be left over for final con- 
sumption (consumer, military, etc.) and how much of each output will be 
used up in the course of the productive activities which must be under- 
taken to obtain these net outputs. It should be clear that a successful 
attack upon these problems can result in an abundance of applications. It 
can be used in predicting future production requirements if usable demand 
estimates can somehow be obtained. Particularly, it can be used for economic 
planning including problems of economic development in “backward 
areas” and problems of military mobilization. A more modest purpose 
which it has already successfully begun to serve is the provision of a very 
illuminating detailed structure for national income accounting. 

As we stated earlier, the intransigence of the empirical materials and 
the computational problems have forced on input-output analysis a number 
of simplifying assumptions even more extreme than those usually employed 
in our theoretical models. Particularly noteworthy are two assumptions, 
each of which has to some extent been relaxed in practice. One assumption, 
which will not be discussed, states that no two commodities are produced 
jointly. Each industry produces only one homogeneous output. But this 
restriction can be somewhat relaxed by interpreting this good as a com- 
posite commodity which is made up of several items produced in fixed 
proportions. Such a compound good can, for example, consist in packages 
of chewing gum and fertilizer in which there are always ten sticks of gum 
and one pound of fertilizer. 

Perhaps more serious is a second assumption which states that in any 


2 The term “equilibrium” is misleading here. The outputs found by this method 
need not satisfy market equilibrium conditions. The analysis qualifies for the title 
“general equilibrium” in that it takes account of the interdependence of the various 
sectors of the economy. Perhaps we can say, more properly, that the model is charac- 
terized by the "general" without the equilibrium. 
productive process all inputs are employed in rigidly fixed proportions and 
the use of these inputs expands in proportion with the level of output. This 
is a special case of an assumption of constant returns to scale (see Chapter 
11, Section 5). But the fixed-proportions assumption is far more restrictive. 
Constant returns to scale is perfectly consistent with the substitution of 
one factor for another. A linear homogeneous production function (con- 
stant return to scale) permits both labor-intensive and capital-intensive 
processes. The firm whose production function exhibits constant returns can 
if it wishes have one hundred workers for every $1,000 invested in ma- 
chinery, or it may use machines which require only ten workers per $1,000 
machine investment. A linear homogeneous production function requires 
only that if the firm decides to triple the scale of either of these types of 
operation, the result will be a tripling of output. Not so the Leontief fixed- 
proportions premise, which requires that a manufacturing process which is 
labor intensive offer no option of a capital-intensive alternative.³ If fifty-three 
men per $1,000 of investment are required at any level of operation, 
it is assumed that the same ratio will be required no matter how much the 
size of the firm expands or contracts. Whether this assumption is relatively 
innocuous or does considerable violence to the input-output results is still 
under dispute. But the premise is certainly never absolutely true, even in 
those cases where chemistry and engineering dictate fixed proportions 
between some ingredient and output. 
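
The contrast between constant returns to scale and fixed proportions can be sketched numerically. The following is only an illustration under stated assumptions: the function names are hypothetical, capital is measured in thousands of dollars, and the substitutable function (a square-root form) is an assumed example of a linear homogeneous production function, not one taken from the text.

```python
import math


def leontief_output(workers, capital, workers_per_unit=53.0, capital_per_unit=1.0):
    """Fixed proportions: output is limited by whichever input is scarcer."""
    return min(workers / workers_per_unit, capital / capital_per_unit)


def substitutable_output(workers, capital, share=0.5):
    """An assumed constant-returns function that permits factor substitution."""
    return workers ** share * capital ** (1.0 - share)


# Tripling both inputs triples output under either function.
print(leontief_output(159, 3), 3 * leontief_output(53, 1))
print(math.isclose(substitutable_output(12, 27), 3 * substitutable_output(4, 9)))

# Extra workers with unchanged capital raise output only when substitution is possible.
print(leontief_output(106, 1) == leontief_output(53, 1))
print(substitutable_output(8, 9) > substitutable_output(4, 9))
```

Both functions pass the scale test, but only the second responds when one input is increased alone, which is the sense in which the fixed-proportions premise is the more restrictive of the two.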


2. The Mathematics 


Basically, the input-output analysis consists in nothing more compli- 
cated than the solution of a set of N simultaneous linear equations in N 
variables. To illustrate this, let us consider a three-industry economy 
which produces coal, steel, and the service of railroad transportation. Each 
of these is measured in dollar terms. Each of these industries employs the 
products of the others in its manufacture, say in proportions shown by the 
following table: 


                           User of Output 
                       Steel    Coal    R.R. 

            Steel       0.2      0.2     0.1 
Producer    Coal        0.4      0.1     0.3 
   of       R.R.        0.2      0.5     0.1 
            Labor       0.2      0.2     0.5 

            Total       1.0      1.0     1.0 


3 But cf. the Samuelson substitution theorem described in Section 4 of this chapter. 
For example, the first column of the table states that every dollar's 
worth of steel uses in its manufacture 20 cents in steel, 40 cents in coal, 20 
cents in railroad transportation, and 20 cents in labor. 

Suppose, now, that somehow there have been set consumer output 
targets of $100 million in steel, $20 million in coal, and $40 million in rail- 
road transportation. How much of each of these goods will have to be 
manufactured for both consumer and industrial use to meet the final out- 
put goals? Let S, C, and R represent the dollar value of this total output of 
steel, coal, and railroad transportation, respectively. Let us first examine 
the demands on the steel industry: In addition to the 100 demanded by 
final consumers, there will be the demand for its product for internal use 
which (the table tells us) amounts to 2/10 of the total steel output or 0.2S. 
Similarly, the railroad industry will require 1/10 of a dollar of steel for 
every dollar of its service, so that the total railroading demand for steel 
will be 0.1R, etc. Thus we have the equation 


(total steel output) = (amount used in steel mfg.) + (use in coal mfg.) 
        + (R.R. use) + (amount left over for consumption), 

or S = 0.2S + 0.2C + 0.1R + 100. 


Similarly, we have the following two equations giving the amounts of coal 
and rail transportation available for final consumption: 


C = 0.4S + 0.1C + 0.3R + 20 
and R = 0.2S + 0.5C + 0.1R + 40. 


These are three simultaneous linear equations in the three unknowns, S, C, 
and R. If we solve the equations for the values of these variables, we find 
what we started out to seek—the total outputs of the three commodities 
needed to meet the stated consumer targets. Only one more step is required. 
We note from the input-output table that $0.2 of labor time are consumed 
in the manufacture of $1 of steel, so that 0.2S dollars of labor will be needed 
to produce the required S dollars of steel production. Continuing in this 
way we see that 0.2S + 0.2C + 0.5R dollars worth of labor will be needed 
to produce the outputs of the three commodities required by our program. 
Taking the price of labor to be fixed, we see that this involves a specific 
requirement of labor man-hours. If this computed number does not exceed 
the available supply, all is well—the targets are feasible. Otherwise more 
modest targets must be substituted. That is the core of the theory of input- 
output. 
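
The computation just described can be sketched in a few lines. This is a minimal illustration, not a production routine: the coefficients are those of the three-industry example (steel, coal, railroads), the helper name `solve` is hypothetical, and exact fractions are used only to keep the arithmetic free of rounding.

```python
from fractions import Fraction as F

# A[i][j]: dollars of good i used per dollar of output of good j
# (order: steel, coal, railroad transportation).
A = [[F(2, 10), F(2, 10), F(1, 10)],
     [F(4, 10), F(1, 10), F(3, 10)],
     [F(2, 10), F(5, 10), F(1, 10)]]
labor = [F(2, 10), F(2, 10), F(5, 10)]   # labor per dollar of each output
targets = [F(100), F(20), F(40)]          # consumer targets, $ million


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Bring a row with a nonzero pivot into position, then normalize it.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


# Rewrite x = A x + d as (I - A) x = d and solve for total outputs.
n = len(targets)
I_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
S, C, R = solve(I_minus_A, targets)
labor_needed = sum(l * x for l, x in zip(labor, (S, C, R)))

print(float(S), float(C), float(R), float(labor_needed))
```

Under these coefficients the required outputs are roughly $188.7 million of steel, $165.5 million of coal, and $178.3 million of rail transportation. The labor requirement comes to exactly $160 million, the sum of the consumer targets, as it must when every column of the table (labor included) adds up to one dollar.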

We can see now why it is so convenient to work with fixed coefficients of 
production. With variable input proportions, single numbers will not suffice 
in the input-output table. Instead we would have to deduce, from the avail- 
able statistics and engineering information, functional relationships be- 
tween the level of output of each industry and the quantity of each input 
which would be required to produce it. The enormous statistical problems 
should be obvious enough. It is equally clear that the relevant equations 
would be complicated enormously. Even with the huge economy effected by 
the fixed-coefficients premise, the statistical and computational difficulties 
are tremendous. We can see that the first three rows of our table contain 
nine figures, the three inputs required by each of the three industries. Simi- 
larly, a four-industry model would require more than 16 figures, and so on. 
The number of required pieces of statistical information increases as the 
square of the number of industries considered, although in practice the 
work is reduced by the fact that many of the entries in the input-output 
table are zeros because some industry, A, does not use as an input any of 
the products of some other industry, B. It can also be shown that the 
number of computational steps involved in solving the equations increases 
as the cube of the number of industries. Thus, the labor involved in an 
input-output analysis rapidly becomes astronomical as the breakdown of 
industrial classifications becomes finer. A table has been constructed for a 
model involving some 450 industries, but most computation has involved 
considerably fewer industries. Certainly even 450 industries is too coarse a 
breakdown for most detailed planning purposes in an economy where the 
number of items produced can be considered to go well into the millions. 


3. A Dynamized Input-Output Model 


The Leontief model has appeared in a number of modified forms. One 
which is of considerable analytical interest is a dynamic model in which 
specific account is taken of the interrelationship of current and past out- 
puts, and, in particular, of the building up of stocks of capital goods (fac- 
tories, goods in process, machinery, etc.). For purposes of this discussion 
we may consider that a current output can be used for any or all of the fol- 
lowing three purposes: for current consumption; as an input in the produc- 
tion of some other output; and, finally, as an addition to the economy’s 
stock of capital. The first two uses of an output have already made their 


4 See Wassily W. Leontief and others, Studies in the Structure of the American Economy, 
Oxford University Press, Inc., New York, 1953, Chapter 3. For a critical discussion, 
see Robert Dorfman, Paul A. Samuelson, and Robert M. Solow, Linear Programming and 
Economic Analysis, McGraw-Hill Book Company, New York, 1955, Chapters 11 and 12. 
appearance in the static input model. It is the last possibility, capital 
investment, which is the novel feature that characterizes the dynamic 
model. 

If the output in question is a building material or a piece of machinery, 
it is clear how it can be used to add to the stock of equipment, factories, 
and other productive facilities. But other outputs can also help to facilitate 
economic activities—production and marketing in the future. Inventories 
of raw materials and goods in process are obviously indispensable for 
smooth production, and effective marketing clearly requires stocks of 
finished goods. Hence the accumulation of outputs which are not used up 
when they are turned out can be essential for future production. This 
observation constitutes a tie-in between present and past (or between 
present and future), which is the crucial characteristic of any dynamic 
model. 

The mathematical relationships which make up the dynamic system are 
a fairly straightforward extension of the ordinary input-output equations. 
The dynamic conditions are of two kinds: 


1. Current output of each commodity must suffice to cover consump- 
tion demands plus interindustry demands plus demands for addition to 
inventory. Thus, the first equation of the preceding section would now read 


S \geq 0.2S + 0.2C + 0.1R + 100 + (K_{s,t+1} - K_{s,t}). 


Here K_{s,t} is the current (period t) accumulated capital stock of steel and 
K_{s,t+1} is therefore next year’s steel stock. Assuming away wear and depreciation, 
the difference, K_{s,t+1} - K_{s,t}, is, therefore, the amount which is added 
to steel capital stocks out of current production. The reason for the use of 
an inequality rather than an equation will be discussed presently. 

2. The second type of relationship which constitutes the dynamic 
Leontief system requires that the capital stock be as large as is necessary 
to produce the planned output levels for the current period. For example, 
if each unit of steel output requires 4.2 units of steel capital goods (in the 
form of equipment, etc.), and if each unit of coal production requires 2.7 
units of steel capital, and each unit of railroad transportation output re- 
quires 3.6 units of steel equipment, we need a capital stock sufficiently 
large for all three purposes, i.e., we must have 


K_{s,t} \geq 4.2S + 2.7C + 3.6R. 


These are the basic requirements of the dynamic input-output system. 
Together they can help us to plan not only for present production, but for 
future output as well. The model takes explicit account of what must be 
put aside today in order to be able to achieve our plans for tomorrow. 
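
The two dynamic conditions for steel can be checked mechanically for any proposed plan. The sketch below assumes the coefficients quoted above; the function name and the trial plan are hypothetical and serve only to show how the two slacks are computed.

```python
def steel_slacks(S, C, R, k_now, k_next, steel_target=100.0):
    """Return (output slack, capacity slack) for steel in one period.

    Both values must be non-negative for the plan to satisfy the two
    dynamic conditions; a positive value signals overproduction or
    excess capacity, respectively.
    """
    current_uses = 0.2 * S + 0.2 * C + 0.1 * R + steel_target
    net_investment = k_next - k_now            # wear and depreciation assumed away
    output_slack = S - (current_uses + net_investment)
    capacity_slack = k_now - (4.2 * S + 2.7 * C + 3.6 * R)
    return output_slack, capacity_slack


# Hypothetical plan ($ million); these figures are not from the text.
out_slack, cap_slack = steel_slacks(S=200, C=160, R=180, k_now=2000, k_next=2010)
feasible = out_slack >= -1e-9 and cap_slack >= -1e-9   # small tolerance for rounding
print(round(out_slack, 6), round(cap_slack, 6), feasible)
```

In this trial plan steel output is just sufficient, while the capital stock exceeds current needs, so the second condition holds as a strict inequality.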

The presence of the inequality signs in the preceding relationships takes 
account of the possibility of overproduction and excess capacity. If, for 
example, in the last relationship we end up with K_{s,t} greater than the sum of 
the terms on the right-hand side, it must mean that the economy’s steel 
equipment is not being fully used; there is excess capacity in this type of 
equipment. The reason such excess capacity can arise is that capital equip- 
ment is inherited from the past, and it can easily turn out that its composi- 
tion does not fit in precisely with current output needs. It may involve too 
much steel and too little coal for our current production pattern. In an 
extreme case, excess steel capacity can become unavoidable because we 
simply do not have enough coal on hand to run the machines. But, in any 
event, we have the choice of producing various different combinations of 
final outputs and investment goods. Consider one such set of commodities, A, 
whose manufacture uses a great deal of steel and very little coal, whereas 
another, B, involves the opposite sort of input requirement. If it is decided 
to produce collection A, the economy may well end up with an excessive 
coal inventory, whereas if collection B is produced, there may be excess 
steel-equipment capacity. Even given the set of consumers’ goods which 
the economy wishes to turn out, different production patterns will arise 
depending on the quantities of the various goods which it is decided to put 
into capital investment. It follows that the dynamic input-output system 
cannot just be solved to give us a unique set of output requirements for any 
set of final output goals. Production goals can be achieved by a variety of 
means, and somehow society must make up its mind among them, pre- 
sumably on the basis of some sort of optimality computation. Planning for 
the long run cannot be reduced to a simple matter of the solution of a system 
of simultaneous equations as in the static input-output case. 


4. Some Theorems of Input-Output Analysis 


Before ending the discussion of input-output analysis it is appropriate to 
describe three noteworthy theorems on the subject and to indicate their 
function. 

1. The Samuelson substitution theorem: It will be recalled from the first 
section of this chapter that a restrictive assumption of the input-output 
analysis is the premise that there are fixed technological coefficients—that 
it takes X man-hours, Y units of raw material of a given type, and so on, 
to produce each unit of an output, and that there is no possibility of any 
other input proportion. The firm has no choice of using or rejecting, e.g., 
labor-saving devices. 

Professor Samuelson has proved that in some circumstances this restric- 
tion is not as serious as it appears. He has shown that even where variation 


5 See Chapters VII, VIII, and IX (written, respectively, by Paul A. Samuelson, 
Tjalling C. Koopmans, and Kenneth J. Arrow) in Tjalling C. Koopmans (ed.), Activity 
Analysis of Production and Allocation, Cowles Commission Monograph 13, John Wiley 
& Sons, Inc., New York, 1951. 
of input proportions is possible, it will never be advantageous provided that 
there are constant returns to scale, only one scarce input (labor, in the 
discussion of Section 2), and no joint products. In other words, the input- 
output proportions may be fixed as is assumed, but they will then be fixed 
by considerations of productive efficiency rather than immutable techno- 
logical requirements. For each commodity there will simply exist one most 
efficient capital-labor ratio (e.g., seventeen men per $1,000 of equipment), 
and no changes in the level of output of that commodity will affect this 
ratio. 

There is really a simple logic behind this result. We saw in Section 10 of 
Chapter 11 that with fixed input prices and a linear homogeneous produc- 
tion function it will never pay to change input proportions no matter what 
the level of output. Suppose that there is only one scarce factor, labor, and 
nothing else preventing the indefinite expansion of national output. This 
means that the real cost to society of the manufacture of any output or any 
other input must be calculated in terms of the amount of labor required to 
produce it. The real price of input A will be, say, two man-hours per unit; 
that of input B will be twelve man-hours. With constant returns to scale 
and unchanged input proportions these labor “prices” will not change. It 
follows that any output should always be produced in a manner which uses 
the same input proportions. The most efficient input proportions will in 
fact be those which make the smallest (direct and indirect) drains on the 
economy’s scarce labor supply. 

Unfortunately, if there is more than one resource in limited supply, this 
substitution theorem no longer holds, and so we are then back to our original 
problem. If the fixed input-output proportions assumption is to be accepted 
at all, it can only be as an approximation to the technological facts of the 
case. 

2. The Hawkins-Simon conditions: Let us now turn to a second theorem 
of input-output analysis, one which is somewhat more abstract but of 
more general applicability than the one which was just discussed. 

After collecting the data for an input-output table it is conceivable that 
the solution of the corresponding input-output equations will yield one or 
more negative numbers. This would imply that negative outputs of some 
commodities are required in order to achieve the final consumption targets! 
Clearly something has gone seriously wrong in such a case. 

Hawkins and Simon have derived mathematical conditions which tell 
us the circumstances that will lead to such a pathological phenomenon and 
help us to understand it.⁶ These conditions are useful as a check on input- 
output data to see whether a mistake has been made in collecting them. 


6 See D. Hawkins and H. A. Simon, "Some Conditions of Macroeconomic Stability," 
Econometrica, Vol. 17, July-October 1949. 
More important, these conditions provide us with mathematical require- 
ments which must be met by any acceptable input-output system, and the 
conditions can therefore be used as a basis for further theoretical analysis. 

In intuitive terms, what the Hawkins-Simon conditions show is that if, 
say, our solution calls for a negative coal output, this must mean that more 
than one ton of coal is used up (directly and indirectly) in the production of 
every ton of coal output. If we have such an unfortunate productive situa- 
tion, only a negative amount of coal will be left over for consumers’ use, 
and this deficit in consumer coal supplies will be smaller the smaller is the 
economy’s total production of coal. In such a topsy-turvy production 
system, the only way to meet consumer targets is to produce negative coal 
outputs— an obvious nonsense possibility. 

In the dynamic Leontief model the situation is only slightly less serious. 
If the Hawkins-Simon conditions are violated by the numbers in the input- 
output table, it may be possible, for a time, to meet consumer demands out 
of inventories. But, eventually, these inventories will be used up and it will 
be impossible to build them up again because an attempt to produce more 
will only hasten the drain on stocks since more of the items in question must 
be used up by the production process than it is able to turn out. 

The nature of the Hawkins-Simon conditions can perhaps be made 
more specific with the aid of a graph. Consider a very simple two-industry 
(steel-coal) input-output model in which neither the steel nor the coal 
industry uses up any of its own product. Then we have the input-output 
equations 


(1) S = aC + T_s, 
(2) C = bS + T_c, 


where S and C are steel and coal output, respectively, T_s and T_c are the 
final net output targets for steel and coal, and a and b are constants. The 
first equation, the demand curve for steel, may be represented by a straight 
line (drawn as the flatter line) in Figure 1. Similarly, the second equation, 
the demand for coal, may be solved for S in terms of C, and rewritten as 


S = (1/b)C - (1/b)T_c. 


This curve can then also be plotted in Figure 1 (the steeper line). The inter- 
section point, B, gives us the solution to the input-output equations, the 
required outputs of coal and steel being the horizontal and vertical 
coordinates of B, respectively.⁷ 


7 Note the role of the limited labor supply in the input-output system. Its limited 
availability means that only input combinations represented by points on or below the 
labor constraint line, LL’, can be produced (shaded region). Points like P, which lie be- 
yond this line, represent outputs which cannot be produced because the requisite labor is 
not available. Fortunately, in the case shown, the solution point B is feasible. Note also 
the similarity to the linear programming diagrams. 
[Figure 1. Lines shown: demand for coal, S = (1/b)C - (1/b)T_c; demand for 
steel, S = aC + T_s; labor constraint. Horizontal axis: coal output.] 


But suppose the two lines were parallel or that the steel demand curve 
were steeper than the coal demand curve. In that case, the two lines would 
not intersect in the positive quadrant. The input-output system would have 
no solution, and the Hawkins-Simon conditions must be violated. What are 
those conditions in this case? To prevent the problem of nonsolvability, the 
slope of the coal demand equation (1/b) must exceed that of the steel de- 
mand equation (a) so that we must have 1/b > a. That is the Hawkins- 
Simon condition for this simple model. 

The economic implication of these conditions which was given above is 
now easily shown to apply in this case. Substitute the coal demand equation 
(2) into the steel demand equation (1) to eliminate the coal output variable, 
C, and obtain 


S = a(bS + T_c) + T_s = abS + aT_c + T_s. 


This last equation tells us that if steel output increases \Delta S units, the amount 
of steel needed to produce the coal which is used to produce that additional 
steel will be ab \Delta S. But if the Hawkins-Simon conditions are violated so 
that 1/b < a, we must have ab > 1, i.e., more than a ton of steel (ab 
units) will be used up in the course of producing an additional ton of steel! 
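
The condition can be checked numerically. The sketch below treats the two-industry case directly and also applies the usual general statement of the Hawkins-Simon conditions, namely that every leading principal minor of the matrix I - A be positive. That general form is supplied here as background and is not derived in the text; the function names and trial numbers are hypothetical.

```python
from fractions import Fraction as F


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def hawkins_simon(A):
    """True if all leading principal minors of I - A are positive."""
    n = len(A)
    I_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]
    return all(det([row[:k] for row in I_minus_A[:k]]) > 0 for k in range(1, n + 1))


def two_industry_solution(a, b, t_s, t_c):
    """Solve S = aC + T_s, C = bS + T_c (requires ab != 1)."""
    steel = (a * t_c + t_s) / (1 - a * b)
    coal = b * steel + t_c
    return steel, coal


# In matrix form the two-industry model has A = [[0, a], [b, 0]].
well_behaved = [[F(0), F(1, 2)], [F(1, 2), F(0)]]   # ab = 1/4 < 1
topsy_turvy = [[F(0), F(2)], [F(1), F(0)]]          # ab = 2 > 1

print(hawkins_simon(well_behaved), two_industry_solution(F(1, 2), F(1, 2), F(10), F(10)))
print(hawkins_simon(topsy_turvy), two_industry_solution(F(2), F(1), F(10), F(10)))
```

With ab = 1/4 the conditions hold and both outputs are positive; with ab = 2 they fail and the formal solution calls for negative outputs of both goods, which is the pathological case described above.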

3. Series approximation to the solution: A final theorem to be discussed 
here arises out of the need for computationally efficient methods of solution 
of the simultaneous equations of the static Leontief system. Since the work 
of numerical computation in the solution of a set of simultaneous equations 
grows so complex as the number of equations increases, it becomes highly 
desirable to find labor-saving methods. One method⁸ which has received 
much attention involves an approximate solution which is analogous with 
the computation of the first few terms in the multiplier series 1 + c + c^2 + 


8 See, e.g., Frederick V. Waugh, “Inversion of the Leontief Matrix by Power Series,” 
Econometrica, Vol. 18, April 1950. 
c^3 + \cdots as an approximation to the value of the multiplier 1/(1 - c). As 
we know, in the multiplier case this will work if the marginal propensity to 
consume, c, is less than unity. In the Leontief computation we have a similar 
condition which states that this procedure will work if the sum of the first 
n elements in any column in an n-industry input-output table is less than 
unity. We may generally expect this to be the case for an operating industry 
in a profit economy because this total is the sum of the costs of all the inputs 
except the labor input going into the production of a dollar’s worth of any 
commodity, and the costs of these inputs must usually be no greater than 
one (dollar), for otherwise it will not pay to produce the item. 

Again, there is a simple piece of intuitive logic to the procedure. The 
purpose of the input-output computation is to answer questions such as 
“How much steel production is needed to satisfy consumer demands?” And 
the answer can be given in the form of the following infinite series: 


The economy will have to produce 

as much steel as consumers will use directly 

plus as much steel as is needed to produce other final consumer 
products 

plus as much steel as is needed to produce the inputs for these 
final consumer products 

plus as much steel as is needed to produce the inputs which are in 
turn used to manufacture those inputs which go into those 
final products 

plus as much steel as is needed for the inputs to make the inputs 
to make the inputs for the final products 

and so on, ad infinitum.⁹ 


The idea in the series-approximation formula is to compute the total 
steel requirements involved in several such stages (say the first fifteen of 


9 This is readily illustrated with the aid of a trivial one-industry (steel) model, in 
which a tons of steel are used up in producing one ton of steel. Here we have one 
input-output equation 

S = aS + T, 

where T is the final output steel target. This equation has the obvious solution 
S(1 - a) = T or S = T/(1 - a). 

But the same solution can be arrived at by the multiplier argument used above. To end 
up with T tons of steel we need to produce the T tons plus aT tons (to be used up in 
producing the T tons) plus a(aT) = a^2 T tons (to be used in producing the aT tons), etc. 
Thus we have the infinite geometric series 

S = T + aT + a^2 T + \cdots, 

which has the well-known solution S = T/(1 - a). 
these rounds), and to take this subtotal, after some upward adjustment, 
as an estimate of total required steel production. 
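
The round-by-round computation can be sketched as follows. This is an illustration only: it reuses the three-industry coefficients of Section 2 (steel, coal, railroads), and the function name `series_outputs` is hypothetical. Each pass adds the inputs needed to produce the inputs computed in the pass before.

```python
def series_outputs(A, targets, rounds):
    """Sum d + A d + A^2 d + ... up to and including the term A^rounds d."""
    n = len(targets)
    term = list(targets)     # the current round of requirements
    total = list(targets)    # running total of required outputs
    for _ in range(rounds):
        term = [sum(A[i][j] * term[j] for j in range(n)) for i in range(n)]
        total = [t + x for t, x in zip(total, term)]
    return total


A = [[0.2, 0.2, 0.1],
     [0.4, 0.1, 0.3],
     [0.2, 0.5, 0.1]]
targets = [100.0, 20.0, 40.0]   # consumer targets, $ million

fifteen = series_outputs(A, targets, 15)    # a truncated estimate
long_run = series_outputs(A, targets, 400)  # effectively the full sum
print([round(x, 2) for x in fifteen])
print([round(x, 2) for x in long_run])
```

Because every term of the series is positive, the truncated totals fall short of the full solution, which is why some upward adjustment is applied. The nonlabor entries in each column here add up to less than one, so the series converges.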
The term "activity analysis" may be taken to refer to the applica- 
tions of linear programming methods to general equilibrium theory. The 
last few years have seen a new burst of effort devoted to this area, which, 
at least until the 1930’s, had remained pretty much as it was left by Walras. 
The three outstanding developments relate to the solvability of the Wal- 
rasian equations, the development of general equilibrium growth models, 
and the application of general equilibrium theory to welfare economics. In 
all three cases the main advance has consisted in the development of power- 
ful methods rather than in the discovery of surprising new theorems. For 
this reason much of the discussion which follows is devoted to the descrip- 
tion of mathematical arguments and analytic techniques, and its economic 
content may on first reading leave one rather disappointed. 


1. The Existence and Uniqueness Problems 


Walras was much concerned with the solvability of his equation system. 
That is, he wanted to be sure that the system of equations he had set up 
sufficed to determine the values of his variables—the prices and quantities 
of the economy’s outputs and inputs. Some writers approached this by 
counting the number of his equations and unknowns. They found he had 
the same number of equations as unknowns and assumed that the problem 
was solved. 

Unfortunately, the matter is much more complicated. We usually ex- 
pect a supply and a demand equation (curve) to determine a single equilib- 
rium price-quantity combination, but this is certainly not true of either 
of the following pairs: 


Demand:    Q = 1,000 − 5P         Q = 1,000 − 5P 
Supply:    Q =   900 − 5P        2Q = 2,000 − 10P, 


where P is price and Q is the quantity sold. The pair of equations on the 
left has no solution (no solution exists): any price-quantity combination 
which satisfies the one cannot possibly satisfy the other because Q + 5P 
cannot be both 1,000 and 900 at the same time. The trouble here is that 
the supply and demand curves are parallel straight lines and never inter- 
sect. In this case we say that the equations are inconsistent and the system 
is overdetermined.¹ By contrast, the other set of equations offers us an 
embarrassment of riches. It is compatible with an infinite number of 
price-quantity combinations. (The solution is not "unique.") In fact, 
since negative prices and quantities have not been excluded, every price 
can be an equilibrium price. In this case the difficulty is that the supply 
and demand curves coincide, so that at every point of this single curve 
demand will equal supply. Here we say that the equations are not inde- 
pendent and the system is underdetermined.² 
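
The three possibilities can be checked mechanically. The sketch below is only an illustration: the function name `classify` and the third, well-behaved supply curve Q = −200 + 5P are assumptions introduced here. Each equation is written in the form aQ + bP = c, and the pair is sorted into the unique, overdetermined (inconsistent), or underdetermined (not independent) case.

```python
from fractions import Fraction


def classify(eq1, eq2):
    """Classify two linear equations, each given as (a, b, c) for a*Q + b*P = c.

    Assumes neither equation has both coefficients equal to zero.
    """
    a1, b1, c1 = (Fraction(v) for v in eq1)
    a2, b2, c2 = (Fraction(v) for v in eq2)
    det = a1 * b2 - a2 * b1
    if det != 0:
        # The curves cross exactly once (Cramer's rule).
        q = (c1 * b2 - c2 * b1) / det
        p = (a1 * c2 - a2 * c1) / det
        return ("unique", q, p)
    # Same slope: either the same line or two parallel lines.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return ("underdetermined", None, None)  # curves coincide
    return ("overdetermined", None, None)       # parallel, never intersect


demand = (1, 5, 1000)                       # Q = 1,000 - 5P
left_pair = classify(demand, (1, 5, 900))   # Q = 900 - 5P
right_pair = classify(demand, (2, 10, 2000))  # 2Q = 2,000 - 10P
# Hypothetical upward-sloping supply curve, Q = -200 + 5P, for contrast.
normal_pair = classify(demand, (1, -5, -200))
print(left_pair[0], right_pair[0], normal_pair)
```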

If there are scarce resources, the Walrasian system may get into trouble 
in yet another way—the solution to the equations simply may not be 
feasible because the available resources do not suffice to produce it. It may 
rightly be suspected that this is where programming enters in, for we are 
almost³ back at the production-with-limited-capacity problem. 
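
The one-output illustration given in the footnote to this paragraph (2 hours of labor and 3 pounds of raw material per unit, with 400 hours and 300 pounds available) can be worked through numerically. This is a minimal sketch in which the dictionary keys and the function name are illustrative assumptions. Replacing the two equations by inequalities, the largest feasible output is set by the scarcest resource and the other resource is left partly unused.

```python
def max_output(requirements, available):
    """Largest Q with requirements[r] * Q <= available[r] for every resource r."""
    return min(available[r] / requirements[r] for r in requirements)


# Figures from the footnote; the key names are hypothetical labels.
requirements = {"labor_hours": 2.0, "raw_material_lb": 3.0}
available = {"labor_hours": 400.0, "raw_material_lb": 300.0}

q = max_output(requirements, available)
unused = {r: available[r] - requirements[r] * q for r in requirements}
print(q, unused)
```

The equations 2Q = 400 and 3Q = 300 would call for Q = 200 and Q = 100 at once, which is impossible; the inequality version settles on the smaller figure.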

An existence theorem (a theorem which states that some equation or 
set of equations possesses at least one solution) does not tell us anything 
about the operation of the economy—rather, it tells us something about 
the operation of the Walrasian model. We know by observation that the 
market somehow determines unique prices and quantities. Thus the 


1 Just as it might be if we had three well-behaved equations in two unknowns. 

2 As would ordinarily be the case where a system consists of one equation in two 
unknowns. 

3 But not quite, because we have not found anything to maximize. It should be 
recalled that this capacity problem also occurred in the input-output analysis where 
the labor requirements of any output target had to be checked against the available 
labor supply. Incidentally, note that if resources are available in the wrong proportions 
it may be impossible to use them up completely, i.e., to satisfy the resource-use equa- 
tions of the Walrasian system. For example, consider a world of only one output, a 
unit of whose production requires 2 hours of labor and 3 pounds of raw material. If the 
available amounts of labor and raw material are 400 hours and 300 pounds, respectively, 
the two equations 2Q = 400 and 3Q = 300 clearly cannot be solved. They must be 
replaced by inequalities which indicate that some labor or raw material will be left 
unused. 
market’s "solution" always exists. An existence analysis can serve only 
as a test for a general equilibrium model, in that if it turns out that the 
model possesses no solution we will perhaps want to reject it on the grounds 
that it may therefore be neither very helpful analytically nor very realistic. 

An existence theorem is a rather esoteric idea. It assures us that a 
problem can be solved but it may tell us nothing about how to go about 
solving it. Nevertheless, it is more important—even to an economist— 
than it may at first appear to be. We know that a system which has passed 
the test of an existence theorem can contain no contradictory elements 
since, clearly, any contradictions within the system would make a solution 
impossible. This may even have some direct economic implications. For 
example, an existence theorem for a system which postulates both full 
employment and an “ideal” allocation of resources proves that these two 
desiderata are not incompatible goals. In other words, such a theorem can 
tell us whether we are pursuing aims which involve having our cake and 
eating it. 

An existence theorem or a uniqueness theorem (a theorem which states 
that the system has no more than one solution) can have further economic 
relevance in another way. Often it will turn out that we can prove an 
existence theorem or a uniqueness theorem for a system only if it satisfies 
some special requirements. For example, we shall see later how such a 
restriction on the nature of consumer demand is used to prove uniqueness 
in a general equilibrium model. Now these requirements can be highly 
suggestive in indicating conditions which may be necessary for such an 
equilibrium to occur in the real world. This will become clearer in our 
discussion of the uniqueness problem below. 


2. Solution of the Existence Problem 


Existence theorems are closely tied in with the so-called fixed-point 
theorems. First, let us see what is meant by a “fixed point.” Suppose we 
have some functional relationship Y = f(X) which associates different 
values of Y with X. Then a fixed point is a specific value of X, say X = X* 
(some number), for which Y* = f(X*) = X*, i.e., for which the value of 
Y is equal to the value of X. The reason such a value of X is called a 
fixed point can be made clear with the aid of the following illustrative 
table, which gives Y as a function of X: 

These data may be given the following geometric interpretation: We have 
four markers (e.g., paper clips) on a rule, one at each X figure, i.e., one at 
the 1-inch mark, one at the 2-inch mark, etc. The function gives us direc- 
tions for moving these markers. It tells us to move the paper clip at the 
1-inch mark to the 9-inch mark, the one at the 2-inch mark to the 7-inch 
mark, etc. Note, however, that the instructions tell us not to move the 
paper clip from the 5-inch mark. That is, X = 5 = Y is a fixed point for 
this function. As another example, we note that X = 1 is a fixed point for 
the equation Y = 3 − 2X because for X = 1, Y = 3 − 2 = 1. 

But how are fixed-point theorems involved in existence proofs? The 
answer is that for a wide class of problems they are practically one and the 
same thing. Suppose, for example, we want to prove that there exists a 
root for the equation f(X) − 5 = 0. This is the same as finding a fixed 
point for the equation Y = f(X) − 5 + X. For if X = X* (where X* is 
some number) is such a fixed point, we have Y = X* so that the equation 
becomes X* = f(X*) − 5 + X*. Subtracting X* from both sides, we see 
that 0 = f(X*) − 5; i.e., X* must be a root of our original equation. More 
generally, we see that X* is a root of the equation 0 = G(X) if and only 
if it is a fixed point for the related function Y = G(X) + X. 
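
The equivalence between roots and fixed points is easily verified on a small case. In the sketch below the function f(X) = 2X + 1 is a purely hypothetical choice (the text leaves f unspecified), made so that f(X) − 5 = 0 has the root X = 2; the line Y = 3 − 2X is the example already given above.

```python
def is_fixed_point(func, x, tol=1e-12):
    """True when func leaves x (numerically) unmoved, i.e. func(x) = x."""
    return abs(func(x) - x) <= tol


def line(x):
    """The example from the text: Y = 3 - 2X."""
    return 3 - 2 * x


def f(x):
    """Hypothetical f, assumed here only for illustration."""
    return 2 * x + 1


def g(x):
    """G(X) = f(X) - 5, whose root we seek."""
    return f(x) - 5


def related(x):
    """The related function Y = G(X) + X."""
    return g(x) + x


print(is_fixed_point(line, 1))              # X = 1 is a fixed point of Y = 3 - 2X
print(g(2), is_fixed_point(related, 2))     # X = 2 is a root of G and a fixed point
print(g(3), is_fixed_point(related, 3))     # X = 3 is neither
```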

There is another way in which we can see the relation between a fixed- 
point theorem and an existence proof, this time for the solution of a pair 
of simultaneous equations. Suppose we are, for example, trying to find a 
solution to the standard supply-demand problem. A clumsy way to go 
about it is to draw the demand and supply curves one above the other as 
in Figure 1. We then pick some price, OP_d, on the Y axis of the demand 
diagram, see what quantity, OQ_d, buyers are willing to buy at this price, 
and then, by moving vertically to the supply curve, we find the price, OP_s, 
at which sellers are willing to supply that quantity. In this way we obtain a 
relationship which gives supply price as a function of demand price, 
OP_s = f(OP_d). If it turns out for some particular demand price, OP_d*, and 
its associated supply price, OP_s* = f(OP_d*), that we have OP_s* = OP_d*, 
it is clear that price OP_d* is a fixed point for this function. But it is also 
obvious that OP_s* and OP_d* are the equilibrium supply and demand prices. 
We see, then, that if the function which relates supply price to demand 
price has a fixed point there must exist a solution to the supply-demand 
equations. This, as we shall see presently, is in essence an outline of the 
McKenzie proof of the existence of equilibrium in the general equilibrium 
system. 
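
The construction just described can be imitated with two hypothetical linear curves. The demand curve Q = 1,000 − 5P is borrowed from Section 1; the supply curve Q = −200 + 5P and all function names are assumptions made only for this sketch. Each demand price is carried to a quantity and then to the supply price at which sellers would offer that quantity, and we look for the price which the procedure leaves unchanged.

```python
def quantity_demanded(price):
    """Hypothetical demand curve, Q = 1,000 - 5P."""
    return 1000.0 - 5.0 * price


def supply_price(quantity):
    """Hypothetical supply curve Q = -200 + 5P, solved for P."""
    return (quantity + 200.0) / 5.0


def supply_price_from_demand_price(demand_price):
    """Supply price as a function of demand price."""
    return supply_price(quantity_demanded(demand_price))


for demand_price in (100.0, 120.0, 140.0):
    mapped = supply_price_from_demand_price(demand_price)
    note = "fixed point" if mapped == demand_price else "not an equilibrium"
    print(demand_price, mapped, note)
```

Only at a price of 120 do the assumed demand price and the deduced supply price coincide, and at that price the quantity demanded, 400, is exactly the quantity supplied.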

The proofs for fixed-point theorems are generally very deep and com- 
plex. However, there is one very simple case which is usually used as an 
illustration. This simple theorem states that in any two-variable equation, 
Y = f(X), if f(X) is continuous (roughly, if there are no breaks in its 
graph—kinks, though, are permitted) and if Y is never negative and never 
larger than some (any prespecified) number N, the function will possess a 
fixed point. This is shown in Figure 2, which plots the function f(X) as 
FF’ between X = 0 and X = N. The graph also contains a 45° line through 
the origin, and it is clear that any point such as P where FF’ intersects the 
45° line is a fixed point, for there we will have Y = X. There are two pos- 
sibilities: If Y = 0 at X = 0, this is our fixed point. On the other hand, if 
Y > 0 at X = 0 so that FF’ starts out above our 45° line, it can “try to 
avoid” the 45° line by staying above it. But since, by assumption, Y can 
never be greater than N, FF’ will be kept from rising above the upper 
dotted line and the 45° line must catch up to it and corner it at X = N, 
if not sooner. 
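
The geometric argument suggests a simple search procedure, sketched here under the theorem's own assumptions (f continuous, with 0 ≤ f(X) ≤ N for X between 0 and N). The routine is an illustrative bisection, not a method taken from the text, and the two test functions are hypothetical. It keeps a left end at which the curve lies on or above the 45° line and a right end at which it lies on or below it, and halves the interval until the two ends meet.

```python
def fixed_point_on_interval(f, n, tol=1e-10):
    """Locate a fixed point of a continuous f mapping [0, n] into [0, n]."""
    low, high = 0.0, float(n)
    # The theorem's assumptions guarantee f(0) >= 0 and f(n) <= n.
    if f(low) < 0 or f(high) > n:
        raise ValueError("f must stay between 0 and n")
    while high - low > tol:
        mid = (low + high) / 2
        if f(mid) >= mid:
            low = mid    # curve still on or above the 45-degree line
        else:
            high = mid   # curve has dropped below the 45-degree line
    return (low + high) / 2


def smooth(x):
    """Hypothetical smooth curve on [0, 10]."""
    return 10.0 / (1.0 + x)


def kinked(x):
    """Hypothetical curve with a kink at X = 4; kinks are permitted."""
    return abs(x - 4.0) + 1.0


x_smooth = fixed_point_on_interval(smooth, 10)
x_kinked = fixed_point_on_interval(kinked, 10)
print(x_smooth, x_kinked)
```

A function may have several fixed points; the procedure finds one of them, which is all that an existence argument requires.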

The first proof of the existence of a solution to the Walrasian system 
was published by A. Wald in 1933. His proof was exceedingly difficult. 
Like the authors of the more recent proofs, Wald had to impose further 
assumptions on the Walrasian model. Several of Wald’s assumptions have 
been considered excessively restrictive and economically unjustified. Since 
then several alternative proofs have been offered involving much weaker 
and more plausible assumptions. 

Without going into details we can now very briefly outline one of these 
proofs, that of Lionel McKenzie. His model contains two sets of inequalities 
and one set of demand equations. The first set of inequalities states that 
production takes place under constant returns to scale, and uses up no 
more than the physical resources which are available. The second set of 
inequalities states that since we are dealing with a case of perfect com- 
petition, profits will be zero.⁴ All products sell at a price which is no higher 


4 It turns out, surprisingly, that these two sets of inequalities are the respective 
inequalities of the primal and dual programs in the linear programming production 
model of Chapter 6! 


554 Activity Analysis and General Equilibrium Chapter 23 


than the cost of production. (Processes which involve a cost greater than 
the price of the commodity will, of course, not be used.) The problem is, 
then, to show that there exists a set of factor prices, commodity prices, 
and outputs which are consistent with those resource limitations, profit 
limitations, and the market demand relationships. 

We now proceed exactly as in the discussion of Figure 1, only here we 
must deal with many prices and quantities at once. Pick some arbitrary 
set of prices and find the quantities which the demand functions tell us 
consumers will buy at these prices. Suppose that these quantities are 
producible within the given resource limitations.⁵ Then use our second 
set of inequalities to see what supply prices are just compatible with these 
output plans and the no-profits perfect-competition requirement. In this 
way we deduce a set of supply prices from any assumed set of demand 
prices exactly as we did in Figure 1. As in the discussion of that diagram, 
to show that an equilibrium price-quantity combination exists, it is neces- 
sary to prove that the assumed demand prices and the deduced supply 
prices can coincide. It is here that we must appeal to a fixed-point theorem. 
It can be shown that there is such a theorem which applies to our problem 
and proves that there is one set of demand prices which is the same as the 
set of supply prices deduced from it. At these prices and quantities, then, 
the demand conditions, the production conditions (resource limitations), 
and the profit conditions are all satisfied. This, in outline, is the McKenzie 
proof of the existence theorem for a general equilibrium model. 


3. Solution of the Uniqueness Problem 


Uniqueness is mathematically somewhat simpler to prove, though 
perhaps it is less plausible economically. There is really no good reason to 
believe that there will be no multiple intersections of supply and demand 
curves and hence a multiplicity of equilibrium points. 

Further, the demand assumption which is used to prove uniqueness, 
although plausible enough for an individual consumer, is, at best, question- 
able when applied to the economy. This premise, which is employed in the 
uniqueness proof, turns out to be the basic assumption in Samuelson's 
revealed preference analysis, which we now review briefly (Chapter 14, 
Section 1). 

Suppose that an individual buys a collection of commodities A rather 


5 It can be shown that at the prices which finally emerge the quantities demanded 
will indeed be feasible. Their production will require no more than the available 
resources. 
than some other collection B which is also available on the market. Pre- 
sumably he will have made this choice either because he likes A better 
than B or because A is cheaper than B. If, in fact, A is more expensive 
than B when the consumer buys A, then the second possibility is ruled 
out—our consumer must have bought A because he actually prefers it to 
B. We therefore say, when a consumer buys the more expensive of these 
two collections, that A has been revealed preferred to B. Suppose that a 
different set of prices could have led the consumer to change his mind and 
buy B. If his tastes do not change so that he still prefers A to B, he pre- 
sumably will not buy B when it is more expensive than A. At least this 
will be the case if his tastes are consistent, for otherwise his buying of B 
rather than A when B is the more expensive collection reveals that he also 
prefers B to A! That is the basic revealed preference premise. In sum, it 
states that consumer tastes are consistent in the sense that if one set of 
prices reveals A to be preferred to B, then there exists no other set of prices 
which can reveal B to be preferred to A, i.e., which makes B more expensive 
than A and yet leads the consumer to buy B. 
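
The premise lends itself to a direct test on observed purchases. The sketch below is an assumption-laden illustration: the bundles, the prices, and the function names are all hypothetical, and "revealed preferred" is read, as in the paragraph above, to mean that the collection bought was strictly the more expensive of the two at the prices then ruling.

```python
def cost(prices, bundle):
    """Money cost of a collection of commodities at the given prices."""
    return sum(p * q for p, q in zip(prices, bundle))


def revealed_preferred(prices, bought, other):
    """True when `bought` was chosen although it cost more than `other`."""
    return cost(prices, bought) > cost(prices, other)


def violates_premise(first, second):
    """Each observation is (prices, collection bought at those prices)."""
    prices_1, bought_1 = first
    prices_2, bought_2 = second
    return (revealed_preferred(prices_1, bought_1, bought_2)
            and revealed_preferred(prices_2, bought_2, bought_1))


# Hypothetical two-commodity collections and price sets.
A, B = (4, 1), (1, 4)
prices_1, prices_2 = (2, 1), (1, 2)   # A dearer at prices_1, B dearer at prices_2

inconsistent = violates_premise((prices_1, A), (prices_2, B))
consistent = violates_premise((prices_1, A), (prices_2, A))
print(inconsistent, consistent)
```

In the first pair of observations the consumer buys whichever collection happens to be dearer, so that each is revealed preferred to the other; in the second he keeps buying A, and no contradiction arises.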

Now although this is a plausible requirement for consistent consumer 
behavior, it has much less intuitive appeal when applied to market de- 
mand. It may be perfectly consistent for the community to buy A rather 
than B when A is more expensive and yet buy B rather than A when prices 
change so that B is the more costly. This is because the price change re- 
distributes real income among consumers whose expenditure patterns 
differ. Thus different consumer groups may foot the bulk of the bill in the 
two cases.⁶ Despite these reservations, let us examine the line of reasoning 
which shows that the revealed preference premise for the market is violated 
if we have a multiplicity of equilibria. But, before the argument can be 
completed, one more preliminary theorem must be explained. This pre- 
liminary result will also be needed in the discussion of a later section. 

At any fixed level of input and output prices, competitive outputs will 
tend to maximize total net profits; e.g., if there is any opportunity to in- 
crease profit by increasing some output at the expense of another, individual 
businessmen will make this switch until the opportunity disappears.⁷ 
But it shall be argued now that this also means that the value of the final 
products (at these fixed prices) will also be maximized. 

To make the argument clearer, consider a simplified production ar- 
rangement in which the economy’s fixed resources are used only by the 
producers of raw materials who sell their entire product to the makers of 


6 For a fuller discussion see J. R. Hicks, A Revision of Demand Theory, Oxford 
University Press, London, 1956, pp. 54-58. 

7 What goes wrong with this argument in the presence of external economies or 
diseconomies? (See Chapter 21, Section 11.) 
finished goods. The total profits in the economy are given by 


The profits of             1. The value of finished products. 
finished-goods      =      2. Minus the cost of produced raw 
producers                     materials. 

        plus 

The profits of             3. The value of produced raw materials. 
raw-materials       =      4. Minus the cost of the economy's 
producers                     fixed resources. 


But the cost to finished-goods producers of their raw materials is exactly 
the same as the value of the product (revenue) of the raw-materials pro- 
ducers. Hence, in adding items 1 through 4 on the right above, to obtain a 
figure for the total profit of the economy, items 2 and 3 must cancel out. 
We see, then, that the total profit earned in the economy will equal the 
value of the output of finished products minus the cost of the economy’s 
fixed resources. 
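
A numerical illustration may make the cancellation plainer. The dollar figures and the function name below are hypothetical, chosen only for the sketch: finished products worth 1,000, produced raw materials worth 400, and fixed resources costing 250.

```python
def economy_profits(finished_value, raw_materials_value, fixed_resource_cost):
    """Return (finished-goods profit, raw-materials profit, total profit)."""
    finished_goods_profit = finished_value - raw_materials_value      # items 1 and 2
    raw_materials_profit = raw_materials_value - fixed_resource_cost  # items 3 and 4
    total = finished_goods_profit + raw_materials_profit
    return finished_goods_profit, raw_materials_profit, total


# Hypothetical figures, for illustration only.
finished_profit, raw_profit, total_profit = economy_profits(1000.0, 400.0, 250.0)
print(finished_profit, raw_profit, total_profit)

# Items 2 and 3 cancel: the total does not depend on the raw-materials figure.
_, _, total_other = economy_profits(1000.0, 700.0, 250.0)
print(total_other)
```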

With a given set of positive prices for all products and resources, what 
can businessmen do to increase the economy’s total profit? Since, by 
definition, the quantities of the various fixed resources are fixed, then with 
the prices of these items given, nothing which businessmen do will affect 
the cost of society’s scarce resources. Hence, anything which can be done 
to add to the total value of final output will add an equal amount to total 
profits (equals the value of final output minus the cost of fixed resources). 
It follows that businessmen will have maximized the total profits of the 
economy if, and only if, they have maximized the total value of finished 
products. And, as we saw at the beginning of the discussion, with any given 
set of positive prices, the maximization of total profit may be expected in 
competitive equilibrium. Therefore, competitive equilibrium will involve 
maximization of the total value of all finished commodities produced in the 
economy. 

We are now only one step from the end of the uniqueness argument. 
Suppose, on the contrary, that equilibrium is not unique—that there are 
two alternative equilibrium output combinations A and B, each with its 
own equilibrium prices. We have just seen that, at the prices which lead 
to the manufacture of A, the value of final outputs will be maximized, 
i.e., A will be at least as expensive as B. Similarly, at the prices at which 
B is produced, B will be at least as expensive as A. But if A and B are 
competitive equilibria, demand must match supply, that is, A must be 
demanded in the first situation (when it is most expensive) and B must 
be demanded in the second (when it is the most costly). A is then revealed 
preferred to B and vice versa. This clearly violates the revealed preference 
assumption for the market, so that if that assumption is to hold, there 
cannot be two equilibrium outputs, A and B. The equilibrium output 
must be unique. 


4. The Von Neumann Model of an Expanding Economy 


From existence and uniqueness theorems we turn now to the second 
of our activity-analysis topics: general equilibrium growth models. Two 
years before Wald published his existence proof, von Neumann delivered a 
paper which is perhaps the most remarkable virtuoso performance in the 
literature of mathematical economics. In this article von Neumann de- 
veloped what is presumably the first general equilibrium analysis of eco- 
nomic growth. In this respect (though not in some of its other properties) 
the model is more complicated than the Walrasian system. Nevertheless, 
von Neumann developed an existence theorem for this model using a 
fixed-point theorem in a way which is somewhat similar to the later exist- 
ence theorem for the Walrasian economy. Moreover, in the course of this 
argument there are clearly discernible features of both linear programming 
and game theory,⁸ and, in particular, of the duality theory of linear 
programming. 

The structure of the model itself can be outlined fairly briefly. Von 
Neumann describes an economy characterized by a linear homogeneous 
production function and in which all outputs serve only as raw materials 
for further production. Consumption can be interpreted as the process 
whereby finished goods are used as inputs in the production of labor. Thus 
consumption also becomes a purely technological phenomenon and ordi- 
nary demand relationships disappear from the model.⁹ 

The production function is described as a set of processes each of which 
turns some inputs into outputs, all in fixed proportions. To make sure his 
economy is completely integrated so that it is not decomposable into un- 
connected subsectors, von Neumann also assumes that each commodity is 
either an input or an output in every process, e.g., it is assumed that every 


8 As we saw in the chapter on game theory, every zero-sum, two-person game can 
be interpreted as a pair of dual linear programs (and the converse is also true), so 
it is really not so surprising that elements of both of these are involved in the von 
Neumann model. 

9 Actually, the absence of any demand functions removes a major difficulty in the 
proof of the existence theorem. In this respect the problem of the existence proof is 
considerably simpler than that for the Walrasian system. 
process either produces size 7-B brown moccasin style shoes or uses them 
as a raw material!¹⁰ 

Suppose now that this economy is expanding; the manufactured out- 
puts of the production processes taken together exceed the outputs of the 
preceding period (which are the inputs for the current period’s products). 
Moreover, assume with von Neumann that there are no limited supplies 
of land, labor, or other factors to put an end to this expansion. Von Neu- 
mann then asks whether there is a constant equilibrium rate of growth of 
the economy which will yield no profits, as required by perfect competition, 
and which satisfies the technological requirement that the process intensities 
during any period require no more than the available raw-material inputs 
(the outputs of the preceding period). 

Equilibrium is defined as a constant proportionate rate of growth of all 
outputs, inputs, and process intensities which satisfies the profit and 
technological conditions just described. (Here a proportionate rate of 
growth, P, means that in every period each output is at least P per cent 
higher than it was in the previous period.) 

Let α be the equilibrium rate of expansion of the slowest-growing item 
of the economy, i.e., the production of each item grows at a rate greater 
than or equal to α per cent per year. Let the money rate of interest be β 
dollars per annum. The no-profit condition then can be restated as follows: 
The money outlay on the inputs of any process plus the interest cost of 
that money for one period must be greater than or equal to the value of 
the outputs of that process. Similarly, the technological condition can be 
formulated as follows: The sum of the current inputs of any item in all 
processes (which is α per cent greater than the amount used in the pre- 
ceding period) must be less than or equal to the output of that item in the 
preceding period. That is, the amount of coal used during the current 
period [equals (1 + α) multiplied by the amount of coal used last period] 
must not exceed the coal made available by last period's output. We see, 
then, how inequalities play a fundamental role in this model as they do 
throughout activity analysis. 
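
The two sets of inequalities can be checked on a toy economy. Everything in the sketch below is a hypothetical illustration rather than von Neumann's own formulation: two processes, two goods, input and output tables invented for the purpose, and α and β written as fractions (so that 1.0 stands for 100 per cent). Process 1 turns a unit of good 1 into two units of good 2, and process 2 does the reverse.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def column(matrix, j):
    return [row[j] for row in matrix]


def technology_ok(inputs, outputs, intensities, alpha, tol=1e-9):
    """Current inputs, (1 + alpha) times last period's, must not exceed
    last period's output of each good."""
    for j in range(len(inputs[0])):
        used_now = (1 + alpha) * dot(intensities, column(inputs, j))
        made_before = dot(intensities, column(outputs, j))
        if used_now > made_before + tol:
            return False
    return True


def no_profit_ok(inputs, outputs, prices, beta, tol=1e-9):
    """Outlay on inputs plus interest must cover the value of outputs
    in every process."""
    for in_row, out_row in zip(inputs, outputs):
        if (1 + beta) * dot(in_row, prices) < dot(out_row, prices) - tol:
            return False
    return True


# Hypothetical tables: rows are processes, columns are goods.
inputs = [[1, 0], [0, 1]]
outputs = [[0, 2], [2, 0]]
intensities = [1, 1]
prices = [1, 1]

print(technology_ok(inputs, outputs, intensities, alpha=1.0))   # feasible
print(technology_ok(inputs, outputs, intensities, alpha=1.5))   # too fast
print(no_profit_ok(inputs, outputs, prices, beta=1.0))          # zero profit
print(no_profit_ok(inputs, outputs, prices, beta=0.5))          # profit remains
```

In this toy case both conditions hold together only when the growth rate and the interest rate are each 100 per cent.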

Von Neumann then proves the following results: 


1. There exists one such equilibrium rate of growth α (existence and 
uniqueness). 


10 This is not as ridiculous as it may sound—probably some steel worker wears out 
such shoes somewhere in American steel production. Moreover, there have been two 
recent articles which analyze the behavior of a von Neumann economy without the use 
of this assumption. See John G. Kemeny, Oskar Morgenstern, and Gerald Thompson, 
“A Generalization of the von Neumann Model of an Expanding Economy," Econo- 
metrica, Vol. 24, April 1956; and David Gale, ‘The Closed Linear Model of Production,” 
in Linear Inequalities and Related Systems, H. W. Kuhn and A. W. Tucker (eds.), Annals. 
of Mathematics Studies 38, Princeton University Press, Princeton, N.J., 1956. 
2. This equilibrium rate of growth will equal the interest rate so that 
the rate of increase of output will just exactly suffice to cover the interest 
cost of investment in inputs. This is an intuitively obvious consequence 
of the no-profit condition. Thus, there will be a unique value for both α 
and β. However, it should be noted that a von Neumann model can be 
consistent with many equilibrium output-price combinations—in this 
respect, then, the solution is not unique. 

3. There may be some processes whose employment involves a financial 
loss. These processes will not be used—i.e., all processes actually operated 
will yield exactly zero profits, as the theory of perfect competition has 
always taught us. 

4. Some outputs may grow at a rate greater than α per cent per period. 
There will be a surplus of such an item over and above what is required as 
input in the next period for the equilibrium growth of the economy. Be- 
cause there will be an excess supply of each such commodity, it will be a 
free good; that is, its marginal utility and hence its price will be zero.¹¹ 

5. There will be no sustainable rate of growth greater than the equilib- 
rium rate of growth α. For if there were available alternative processes 
capable of yielding a higher growth rate, α′, then the α growth rate would 
not be consistent with equilibrium. Entrepreneurs would switch to these 
alternative processes because with interest at the old rate, β, they would 
make a profit. With the interest rate then raised to β′ = α′, to eliminate 
this profit the old nonmaximal growth-rate processes would only be oper- 
able at a loss. 

The extreme abstraction involved in the von Neumann model hardly 
needs to be pointed out. Later work has removed some of the more un- 
palatable assumptions. The conclusions are of considerable interest in 
themselves. Nevertheless, primary interest in the model continues to reside 
in the analytic tools which were developed and exhibited with its aid. 


5. Activity Analysis and Welfare Economics 


Doubtless the best-known theorem of elementary welfare analysis 
asserts that a long-run, perfectly competitive equilibrium will yield an 
optimal allocation of resources. Not only is the theorem elementary and 
well known (as has already been shown in Chapter 21), it is also, strictly 
speaking, untrue, or rather, true only under some fairly restrictive 
assumptions. 

In recent years there has been much work devoted to the development 
of an alternative activity-analysis proof of this theorem. It may well be 


11 Results 3 and 4 are directly related to the second duality theorem of linear program- 
ming which is described in Section 3 of Chapter 6. 
asked why this has been thought to be necessary. For one thing, activity 
analysis has made no attempt to dispute the restricted validity of the 
result. In fact, no way has even been found to apply the methods of ac- 
tivity analysis to external economies and the related difficulties which are 
incompatible with the optimality of perfect competitive equilibrium. 
Rather, the new approach has been helpful in another way. 

The standard welfare economics deals only with commodities which 
are actually bought in the market and not with those which are free goods 
or for which no customers can be found at a profitable price. For old- 
fashioned welfare theory leans heavily on the marginal conditions of 
equilibrium, e.g., the condition of equality of price ratios to the marginal 
rates of substitution. But these conditions need not hold for free or unsalable 
goods. In old-fashioned terms, if each consumer chooses not to buy a 
commodity, the marginal utility of that item may well be less than its 
price (note the inequality again). Moreover, the cost of production of such 
an unsalable good must be greater than its price. For free goods the ratio 
of prices is not even defined. It follows that the standard version of the 
theorem which we are discussing must be restated to read that (where the 
theorem is valid) a competitive economy will allocate resources optimally 
among commodities which are salable without loss and which are not free. But 
which commodities will these be? We cannot assume we know the answer 
in advance, for the answer is an economic question of costs of production 
and demand patterns. Moreover, though our intuition may tell us that 
this is so, we must prove rigorously that there can be no preferable alloca- 
tion of resources to free or unsalable goods.¹² 

Old-fashioned welfare theory, by taking marginal utility to equal 
price, may end up requiring negative consumption of an unwanted com- 
modity since even with zero consumption its marginal utility may turn out 
to be less than its price. Similarly, it cannot preclude the economic ab- 
surdity of negative prices for “free goods.” But since activity analysis 
can cope with inequalities, it can specify that (1) prices and quantities 
exchanged must all be greater than or equal to zero, (2) the average cost 
of production must be equal to price for all items which are produced, and 
greater than the price at which any item that no one considers worth pro- 
ducing can be sold, and (3) production must not exceed the levels made 
possible by the available resources of society. Subject to these and the 
limitations of competition, businessmen and consumers are then taken to 
do the best they can for themselves. Marginal equalities and inequalities 
do not even make an explicit appearance. 


12 Pigou long ago pointed out that the production of some items which it would be 
unprofitable to produce under pure competition can conceivably yield a net benefit to 
society. See The Economics of Welfare, 4th edition, Macmillan & Co., Ltd., London, 1938, 
pp. 283, 810-811. However, Pigou’s case requires decreasing costs, and these cannot be 
handled by the activity-analysis approach as developed to date. 
Before outlining the proof of the theorem, let us first recall two related 
concepts. A production arrangement is called efficient if any alternative 
productive arrangement which increases the output of some commodity 
must also involve a decrease in the output of some other commodity. The 
motivation of the definition is obvious. Any productive arrangement 
which is not efficient in this sense requires that the economy forego the 
opportunity to get something for nothing—the opportunity to increase 
the output of some item, X, without giving anything up in exchange. Re- 
lated to efficiency is the concept of Pareto optimality (cf. Chapter 21, 
Section 31, for the reason for this nomenclature). A situation is said to be 
“Pareto optimal” when it is impossible to effect a change which benefits 
some individual without any deleterious effects on someone else. Efficiency 
is then a purely technological concept whereas Pareto optimality is the 
corresponding concept for individuals as consumers and in their other 
economic roles. 

Let us now see how activity analysis can be used to prove that every 
competitive equilibrium is technologically efficient and that every efficient 
output combination is a competitive equilibrium, i.e., that for each efficient 
point there can be found a set of prices which would under perfect com- 
petition produce the efficient output combination in question. This part 
of the theorem amounts essentially to the standard result that a competi- 
tive output can occur at and only at any point on the production possibility 
locus (transformation surface), the graph which shows the various output 
combinations which society can produce with its available resources (curve 
TT’ in Figure 3). For the production possibility locus is the locus of all 
efficient points. For example, we see that, with output $OX_1$ of X, the 
largest possible output of Y is $OY_1$ so that point C is efficient. But, on the 
other hand, a point like A which lies inside the transformation locus TT’ 
represents an inefficient combination of outputs X and Y because it is 
possible to move to an efficient point like B which lies on TT’ northeast 
of A, so that B involves greater outputs of both commodities. Note that 
although C is also efficient, as we have just seen, activity analysis as so far 
described has not settled whether it is or is not more desirable socially 
than is A.13 

How do we know that a competitive output is technologically efficient? 
The answer is simple. With fixed prices, an inefficient output cannot involve 
a maximum money value of output. For we can increase the output 
of, say, X without decreasing that of any other commodity and end up 
with an output combination whose money value is obviously increased. 
Thus, with fixed prices, the money value of E in Figure 3 is clearly higher 
than that of A. But we have seen earlier (Section 3) that a competitive 
equilibrium necessarily involves a maximum value of output when valued 
at the equilibrium prices. Hence, no competitive equilibrium can be 
inefficient. 


13 We see, then, that there will usually be many efficient output combinations 
represented by the points that make up the transformation locus. The locus can be found with 
the aid of programming techniques. The trick is to choose any output of X, say $OX_1$, 
and find the maximum output of Y permitted by the resources left over from the 
production of that quantity of X. This is clearly a programming computation, and it will 
show that $OY_1$ is the maximum amount of Y then producible. In this way point C 
on the transformation curve will have been located, and, by starting with other values 
of X, other points on TT’ can be found in the same way. For the details of the analogous 
derivation of the contract curve equation, see footnote 11 of Chapter 21. 
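
The programming construction of the transformation locus (fix an output of X, then find the largest output of Y that the leftover resources permit) can be sketched for a very small case. The economy below is hypothetical: the two resources, their endowments, and the fixed input requirements are illustrative assumptions, and with only two goods the "program" reduces to taking the tightest resource bound.

```python
from fractions import Fraction

# Hypothetical two-good, two-resource economy (all numbers are illustrative).
RESOURCES = [
    # (endowment, units needed per unit of X, units needed per unit of Y)
    (Fraction(8), Fraction(1), Fraction(2)),
    (Fraction(9), Fraction(3), Fraction(1)),
]


def max_y(x):
    """Largest output of Y that the leftover resources allow once x units of X are made.

    Returns None when the chosen output of X is itself infeasible.
    """
    x = Fraction(x)
    best = None
    for endowment, need_x, need_y in RESOURCES:
        left = endowment - need_x * x        # resource left after producing X
        if left < 0:
            return None
        bound = left / need_y                # Y that this resource alone would allow
        best = bound if best is None else min(best, bound)
    return best


def transformation_locus(x_values):
    """Pairs (X, maximum Y) for every feasible X in x_values."""
    return [(Fraction(x), max_y(x)) for x in x_values if max_y(x) is not None]


if __name__ == "__main__":
    for x, y in transformation_locus([0, 1, 2, 3]):
        print(f"X = {x}, maximum Y = {y}")
```

Each pair returned is one point on the locus; repeating the computation for other values of X traces out the rest of the curve.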

More difficult is the proof of the converse, which states that every 


Figure 3 (vertical axis: output of Y; horizontal axis: output of X) 
Figure 4 


efficient output combination is a competitive output. However, the method 
of proof is interesting in and of itself. This employs an “intuitively obvious” 
mathematical theorem whose proof is, in fact, rather difficult. The theorem 
states that, given an (N-dimensional) convex geometric figure (convex 
set) and any point R on the boundary of or outside this figure, it is always 
possible to draw at least one line (N-dimensional hyperplane) through R 
in such a way that the convex figure lies entirely on one side of this line.14 
Thus, in Figure 4 the shaded region is convex. Through points A, B, and 
C on its boundary, and point D outside the figure, lines have been drawn 
which have the required property. Such a line through a boundary point 
like A, B, or C is called a supporting line (plane) of the convex set. Such a 
line can be taken to divide the plane into two half-planes (half-spaces) and 
in each case the shaded figure lies entirely within one of the half-spaces 
produced by the lines in the figure. Supporting lines represent a generaliza- 
tion of the concept of tangency to cover the case where the boundary curve 
has a kink (e.g., at point B—note that there are many supporting hyper- 
planes at such points). 


14 For the definition of a convex set, see Chapter 7, Section 4. The theorem does not 
hold for nonconvex regions as can be shown with the aid of Figure 5b in Chapter 7, 
where any line through point E must intersect the nonconvex shaded figure. 
Consider now one of the supporting lines of the convex set, e.g., line 
LL’ through point A in Figure 4. Since the convex set lies entirely on one 
side of such a line, then (if the line is not vertical) either no portion of the 
convex set will lie above this line or no part of the set will lie below this 
line. Suppose the convex set does not lie above the supporting line in 
question (as is the case with LL’). Then any point of the set (e.g., point 
P) must lie on or below LL’, and any line parallel to LL’ which goes 
through a point in the convex set (e.g., line MM’ through point P) must 
either coincide with LL’ or lie entirely below it. In other words, such a 
supporting line must be the highest of all the lines which are parallel to it and 
which meet the convex set at any point. We shall employ this result by interpreting 
the family of lines parallel to our supporting line as a set of price 
lines which, as usual, involve price ratios that are given by the (constant) 
slope of these lines. Then the italicized result may be translated to read as 
follows: If only output combinations which are represented by points 
within the convex region are attainable, the highest attainable price line 
goes through A, the point of tangency with the convex region. Of course, 
this final statement sounds at least vaguely familiar. 

Now to get back to our theorem that every technologically efficient 
point is a competitive equilibrium. We note first that the set of points 
representing all the feasible outputs in a linear program must form a 
convex geometric figure.15 Suppose now that some output is efficient. We 
have seen that the point which represents this output combination must 
lie on the upper boundary of the feasible region (compare curve TT’ in 
Figure 3). Moreover, we know that through every such point there passes 
(at least) one supporting line. This line can be interpreted as a price line 
giving the value of the output through A, and its slope can be taken to 
represent the ratio of the prices of X and Y. In other words, for any effi- 
cient point there will always exist relative prices at which the efficient 
point maximizes the value of output.16 

With these prices, moreover, businessmen under pure competition will 
(in the absence of external economies and diseconomies) be motivated to 
produce the technologically efficient output in question, for, as we have 
argued, it will pay them to maximize the value of output. Thus any such 


15 See footnote 1 of Chapter 7. 

16 The economic interpretation requires that the price line have a negative slope. 
But we can show that at an efficient point the slope of the price line must be negative, 
for we know that if all prices are positive, the slope of a price line will be negative. 
(Indeed this is why we want the price line to have a negative slope.) But suppose the 
contrary, that one of the “prices” deduced from the supporting price line is negative. 
We can then increase the value of output by decreasing the quantity of the commodity 
whose price is negative, which means that we can increase output value by leaving the 
efficient point, contrary to what has just been shown, i.e., that the value of the efficient 
output with this price line will be a maximum. 
efficient output combination is a competitive output combination; at some 
set of prices competitive businessmen will produce it. 
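
The supporting-line argument can be checked numerically on a small convex region. The sketch below is only an illustration under stated assumptions: the four corner points are a hypothetical feasible region, and the search looks only at small positive integer prices. Because a linear value function attains its maximum over a convex polygon at a corner, comparing corners is enough.

```python
# Corners of a hypothetical convex feasible region (outputs of X and Y).
VERTICES = [(0, 0), (3, 0), (2, 3), (0, 4)]


def is_efficient(point, points):
    """True when no other point offers at least as much of both goods."""
    x, y = point
    return not any(u >= x and v >= y and (u, v) != (x, y) for u, v in points)


def supporting_prices(point, points, max_price=5):
    """First pair of positive integer prices at which `point` maximizes output value.

    Returns None if no such pair exists within the searched range.
    """
    x, y = point
    for px in range(1, max_price + 1):
        for py in range(1, max_price + 1):
            value_here = px * x + py * y
            if all(value_here >= px * u + py * v for u, v in points):
                return (px, py)
    return None


if __name__ == "__main__":
    for vertex in VERTICES:
        if is_efficient(vertex, VERTICES):
            print(vertex, "is value-maximizing at prices", supporting_prices(vertex, VERTICES))
```

In this convex example every efficient corner is value-maximizing at some set of positive prices, which is the content of the converse theorem.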

This, then, proves the theorem about the efficiency of competitive 
equilibrium. There is some similarity between this proof and the proof of 
the Pareto optimality theorem, which will not be described in detail. 
Clearly the latter theorem must take into consideration consumer de- 
mands, and this involves a number of complications. In effect, what we 
want to prove is that every competitive equilibrium involves tangency 
between the transformation locus and some sort of consumers’ community 
indifference curve (an indifference curve which, in some sense, represents 
the tastes of all consumers taken together), and that any such point of 
tangency is a competitive equilibrium. For then society will have attained 
the highest state of welfare compatible with the available production pos- 
sibilities. The tangency can be assured by the tangency of both the trans- 
formation curve and the community indifference curve with the same 
price line. 

Here too we have a theorem on convex sets which comes to our assist- 
ance. The theorem states that if two convex sets meet only at boundary 
points, there will be at least one line such that one convex set lies in one 
of the half-spaces generated by the line and the other convex set lies in 
the line’s other half-space. 

Consider all output combinations above and to the right of some indifference 
curve, i.e., all the points which involve outputs that are preferred 
to or indifferent with the outputs at some point, R, on the indifference 
curve QQ’ (Figure 5). Such a preferred point is defined in the Pareto sense as one 
that involves an output combination which can make some consumers better 
off than they were at R, without hurting anyone. It can be argued that the set of 
points preferred to or indifferent with R (shaded region in Figure 5) is convex. 
The set of feasible output points (the points in region OTT’ which lie on or 
beneath the transformation curve, TT’) constitutes a convex set, at least in 
cases involving constant returns (linear programming) or diminishing returns 
(see Chapter 7, Sections 4 and 5). There will then be a line, PP’, which 
separates these two regions. If, in addition, point R lies on the transformation 
locus, PP’ will be a supporting line for both sets and R will be a feasible 
efficient point which lies on the highest possible indifference curve. 


Figure 5 
6. Dual Prices and Decentralized Decision Making 


Section 9 of Chapter 21 discussed a pricing scheme which was designed 
to achieve the results of central planning without detailed centralized 
direction. The idea is simply to establish such a set of prices that the in- 
dividual plant or company manager is forced to make the “right” decisions 
in order to maximize his profits. 

Some light can be thrown on the nature of this sort of arrangement 
with the aid of the analysis of the preceding section and the duality theo- 
rems of linear programming (Section 3 of Chapter 6). It will be recalled 
that one of the properties of the dual program is that the optimal values of 
its variables can be interpreted as accounting prices of the scarce resources 
in the primal production problem. Moreover, these accounting prices have 
the following properties: 

1. The cost of the scarce resources used by the firm when evaluated at 
these accounting prices will be exactly equal to the firm’s total profits. 

2. If the firm is charged these dual prices whenever it uses its scarce 
resources, any output or process which should, optimally, not be employed 
by the firm will actually involve the firm in a loss, i.e., only negative 
profits can be earned on a commodity which the firm should, optimally, 
not be producing or on any process which the firm should, optimally, not 
be using. 

Suppose, now, that the economy’s production function is linear and 
homogeneous, and that the central authority were to look for a set of 
commodity prices and input prices which will lead individual businessmen 
to produce some efficient combination of outputs. As we saw in the previous 
section, for every such output combination, Q, there exists some set 
[...] Since the production function is, by assumption, linear and 
homogeneous, this will be a linear programming problem. It will 
therefore have a dual whose solution will be 
a set of accounting prices, R, for society’s scarce resources. 
The planning authority [...] for all final products; (2) se[...] 
resource he will be required [...] item for every unit he uses. 
In that case every business firm will find it unprofitable to produce any 
outputs or to use any processes which are not included in the optimal 
(efficient) output combination, Q, for by the standard properties of the 
dual prices every nonoptimal output or process will incur a loss. 

Moreover, if every businessman expands his output to capacity (as he 
can do without loss since the dual prices just permit zero profits), the 
result will be the production of exactly the efficient output combination, 
Q. For as we have seen in Section 3, with fixed input and output prices, 
businessmen can only maximize their profits by maximizing the value of 
outputs, and the officially enforced prices P have been chosen so as to 
make Q the output whose value is a maximum. Thus if the authority sets output 
prices P and dual input prices R, businessmen will be forced by the pursuit 
of self-interest to make decisions which are socially optimal. No detailed 
central output directives will be required to achieve this optimal result. 
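
The two properties of the accounting prices can be verified on a small numerical case. Everything in the sketch below is a hypothetical illustration: the three activities, their unit revenues, the resource requirements, and the resource stocks are made-up numbers, and the dual is solved by simple enumeration of corner points, which is adequate only because there are just two resources.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical firm: three activities, two scarce resources (illustrative numbers).
PRICES = [F(4), F(3), F(1)]          # revenue per unit of each activity
USAGE = [[F(2), F(1), F(1)],         # resource 1 needed per unit of each activity
         [F(1), F(3), F(2)]]         # resource 2 needed per unit of each activity
STOCK = [F(10), F(15)]               # available amounts of the two resources


def solve_2x2(a, b, c, d, e, f):
    """Solve a*u + b*v = e and c*u + d*v = f; return None if the system is singular."""
    det = a * d - b * c
    if det == 0:
        return None
    return ((e * d - b * f) / det, (a * f - e * c) / det)


def dual_prices():
    """Accounting prices r = (r1, r2) that minimize the total value of the stocks,
    subject to r >= 0 and to no activity showing a positive unit profit."""
    lines = [(USAGE[0][j], USAGE[1][j], PRICES[j]) for j in range(3)]
    lines += [(F(1), F(0), F(0)), (F(0), F(1), F(0))]   # boundaries r1 = 0 and r2 = 0
    best = None
    for (a, b, e), (c, d, f) in combinations(lines, 2):
        r = solve_2x2(a, b, c, d, e, f)
        if r is None or min(r) < 0:
            continue
        if any(USAGE[0][j] * r[0] + USAGE[1][j] * r[1] < PRICES[j] for j in range(3)):
            continue                                      # some activity would be profitable
        cost = STOCK[0] * r[0] + STOCK[1] * r[1]
        if best is None or cost < best[0]:
            best = (cost, r)
    return best[1]


def unit_profits(r):
    """Profit per unit of each activity after paying the accounting prices r."""
    return [PRICES[j] - USAGE[0][j] * r[0] - USAGE[1][j] * r[1] for j in range(3)]


if __name__ == "__main__":
    r = dual_prices()
    print("accounting prices:", r)
    print("unit profits:", unit_profits(r))
    # Activities 1 and 2 break even; running them so that both stocks are used up:
    x1, x2 = solve_2x2(USAGE[0][0], USAGE[0][1], USAGE[1][0], USAGE[1][1], *STOCK)
    print("value of output:", PRICES[0] * x1 + PRICES[1] * x2)
    print("value of resources at accounting prices:", STOCK[0] * r[0] + STOCK[1] * r[1])
```

In this example the third activity shows a loss at the accounting prices, so a profit-seeking manager leaves it unused, while the value of the resources at those prices equals the value of the output produced.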


7. Integer Programming and Welfare Economics 


Most of the preceding results hold for cases involving linear or other 
convex relationships (constant or diminishing returns to scale) and where 
there are no problems of indivisibilities (a steam shovel is indivisible 
because we cannot produce one-half or one-fourth of a steam shovel). 
However, the presence of indivisibilities and increasing returns complicates 
the situation, and we shall see now that in these cases things do not work 
out so well. 

These problems can be treated with the aid of integer programming 
analysis in which the answer is always required to contain no fractional 
parts. Its relevance to the indivisibilities case is clear since we desire an 
analysis which can avoid nonsense answers involving fractional parts of 
steamships, or drill presses. At least in principle, the increasing-returns case 
can also be reduced to an integer programming computation. Let us see, 
then, what follows for our welfare theorems from the integer programming 
analysis. 


As in ordinary linear programming, it remains true in the integer programming 
case that every value-maximizing (competitive) output will 
also be efficient. This can be shown by exactly the same argument as that of 
Section 5. Unfortunately, the converse does not hold. There may be efficient 
outputs which are not competitive, i.e., for which there exist no prices at which 
this output combination maximizes the total value of output. This is easily 
proved by counterexample, as shown in Figure 6. Here the shaded triangle, OBC, 
contains all of the feasible lattice points (the points representing all 
possible solutions with no fractional coordinates). Point A, with coordinates 
(2, 1), lies in the interior of this triangle. But (because the feasible 
points are isolated) it is possible for such an interior point to be efficient. 
This is in fact the case with A, for there is no feasible lattice point which 
“dominates” A, i.e., no point which lies directly above it, directly to the 
right of it, or above it and to its right. 


Figure 6 

Now consider any straight (price) line, such as PP’, through A. Any 
such line must lie below either lattice point B or lattice point C. This 
means that there must exist another parallel price line such as P’’P’’’ 
which lies above PP’ and goes through one of these corners of triangle 
OBC. In other words, in the case shown in Figure 6, at the prices involved 
in the price (iso-output-value) lines shown, the value of output at point C 
exceeds that at A. And, similarly, at any other possible set of output prices 
the value of output at A will be smaller than that at B or that at C. This 
shows how, in the discrete programming case, there are likely to arise 
efficient outputs which are not competitive (value-maximizing) outputs 
and which therefore cannot be enforced by the standard type of decentralized 
control procedure of the economic literature, in which the central 
authority makes only simple price decisions. 
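
The counterexample can be reproduced by direct enumeration. The triangle used below is an assumed stand-in for Figure 6, whose exact corners are not given in the text: B = (0, 3) and C = (4, 0) are hypothetical choices that leave A = (2, 1) as an interior lattice point. The price search covers a finite grid only, but for this triangle the conclusion also follows algebraically: beating B requires the price of X to be at least the price of Y, while beating C requires the price of Y to be at least twice the price of X, and both hold only when both prices are zero.

```python
# Hypothetical stand-in for Figure 6: the triangle with corners O = (0, 0),
# B = (0, 3) and C = (4, 0), i.e. the lattice points satisfying 3x + 4y <= 12.
LATTICE = [(x, y) for x in range(5) for y in range(4) if 3 * x + 4 * y <= 12]
A = (2, 1)


def dominates(q, p):
    """q offers at least as much of both goods as p and is a different point."""
    return q != p and q[0] >= p[0] and q[1] >= p[1]


def is_efficient(p, points):
    """True when no feasible lattice point dominates p."""
    return not any(dominates(q, p) for q in points)


def maximizes_value(p, points, px, py):
    """True when p has the largest output value among points at prices (px, py)."""
    return all(px * p[0] + py * p[1] >= px * q[0] + py * q[1] for q in points)


def has_supporting_prices(p, points, grid=50):
    """Scan nonnegative integer price pairs (not both zero) up to `grid`."""
    return any(
        maximizes_value(p, points, px, py)
        for px in range(grid + 1)
        for py in range(grid + 1)
        if (px, py) != (0, 0)
    )


if __name__ == "__main__":
    print("A efficient:", is_efficient(A, LATTICE))
    print("A value-maximizing at some prices:", has_supporting_prices(A, LATTICE))
```

The contrast with the convex case is that here an efficient point exists for which no fixed price line does the job.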

It is to be noted, however, that it is possible to find families of non- 
linear or piecewise linear price curves such as RR’ for which the value of 
output is maximized at A. This has a simple interpretation. The prices 
which are set up are discriminatory and vary with the magnitude of out- 
put. Output combinations which are close to A are given relatively high 
prices, but as outputs move farther and farther from A prices are made 
increasingly unfavorable to the seller so that there are sharply diminishing 
returns to departures from A. In other words, an output, $x_i$, of any commodity 
at A is broken arbitrarily into a sum of suboutputs 

$$x_{i1} + x_{i2} + \cdots + x_{in} = x_i$$

and each of the suboutputs $x_{ij}$ is assigned a different price, $p_{ij}$, as just 
described. Such an arrangement could, in principle, be enforced by government 
fiat. But it is difficult to see much advantage to a decentralized control 
procedure when it becomes so complicated, and in any event it would 
never result from the spontaneous operation of competitive market forces 
which preclude the existence of different prices for different units of a 
homogeneous product. 

The so-called basic theorem of welfare economics runs into even more 
serious trouble in integer programming. It is in this situation not generally 
possible to attain a Pareto optimal point by means of a price system. 
This is obviously so for the case of interior efficient points such as A in 
Figure 6. Let RR’ now represent a community indifference curve so 
that A is now the optimal feasible point. There obviously exists no line 
which separates the remainder of the feasible lattice points from the region 
socially preferred to or indifferent with A (the region above RR’). This 
means that with any fixed price arrangement producers will find it more 
profitable to manufacture either output combination B or C than to turn 
out the social optimum combination, A. 

Let us summarize the results of this section: 


1. Every competitive output combination is efficient and any such 
point can be attained by a system of fixed prices set by central authority, 
all other decisions being left to the individual firms in the economy. This 
is no different from the result for the ordinary linear programming case. 

2. However, unlike the ordinary linear programming case, not every 
efficient output can be achieved by simple centralized pricing decisions or 
by competitive market pricing processes. 

3. Moreover, it is possible in the integer programming case that there 
exists no hyperplane which separates the feasible lattice points from those 
which are preferred to or indifferent with the optimal lattice point. In 
other words, there may exist no set of prices which simultaneously makes 
the optimal point, Q, the most profitable among those which can be pro- 
duced and the cheapest among those which consumers consider to be at 
least as good as Q. That is, at any set of prices either producers will try 
to make, or consumers will demand, some other output combination. 


It should be observed, in conclusion, that these limitations on the 
price system in the integer programming case should not be entirely sur- 
prising. For, as has already been indicated, cases of increasing returns to 
scale can, at least in principle, be reduced to integer programming problems. 
And in such cases it has long been recognized that the price system runs 
into difficulties. 
Theory of Distribution 
24 


The theory of distribution deals with the determination of the 
levels of payment to the various factors of production—the prices of the 
economy’s inputs. Since general equilibrium analysis seeks to account for 
the determination of every price in the economy, it includes the pricing of 
inputs within its scope; that is, the analysis of distribution must ultimately 
be considered a segment of general equilibrium theory. Since a change in 
the level of wages, interest rates, or rents has significant ramifications 
throughout the economy, the general equilibrium aspects cannot easily 
be ignored. 

In this chapter, there is no attempt at a systematic discussion of the 
three traditional input categories, land, labor, and capital. So simple a 
breakdown is somewhat out of fashion, since each of these categories 
includes within it so huge a variety of heterogeneous elements. It is often 
not helpful to treat coal, cloth, and a drill press as one homogeneous 
element—capital. Nevertheless, the categories still retain considerable 
convenience as shorthand analytic devices, and they will be used where 
they prove handy. 


1. Controversies in the Theory of Distribution 


Recently the theory of distribution has become the subject of a rather 
spirited debate between writers associated with Cambridge, Massachusetts, 
and Cambridge in the United Kingdom (including a group of Italian 
economists associated with the latter). Much of the argument has centered 
about fairly technical topics in the theory of capital and, accordingly, it 
will be considered later, in the chapter on Capital and Distribution. But 
the broad subject of the discussion can serve as a useful introduction to 
the materials of this chapter. 

The basic issue is what one can reasonably expect from distribution 
theory. Three possibilities come to mind: (a) One can hope for a sophis- 
ticated analytical (micro) model which passes various tests of logical 
consistency and completeness and whose complexity reasonably reflects 
the complexities of the real world; (b) one can want a tractable macro 
model which rests on some well-chosen simplifications that exact only a 
small cost in distortion of reality but in return offer us a firm intuitive 
grasp of the main relationships and a reasonable basis for policy design; 
(c) one can aspire to a model which enables us to evaluate the fairness and 
morality of the distributive arrangements in our own or some other society. 

We can immediately and most firmly dispose of the last possibility. 
Value judgments do not emerge from a behavioral model without first 
being put into it. We may all agree that more equality in income distribu- 
tion is a good thing, but that is a view which does not emerge from our 
theory. Adam Smith and Ricardo could have espoused the same distribu- 
tion model (though in fact they did not) but Smith in the spirit of the 
eighteenth century might have considered a call for more equality to be 
quixotic, while Ricardo, as an early liberal of the nineteenth century, 
might have taken it to be a goal which was obviously desirable but difficult 
to attain. 

Few economists have argued that our theory tells us anything about 
justice in distribution, though there have been several exceptions. At the 
end of the nineteenth century J. B. Clark concluded from marginal 
productivity theory that “the different classes of men who combine their 
forces in industry have no grievances against each other,” and that the 
distributive arrangements under pure competition “. . . treat men fairly” 
and are “. . . determined by a principle that humanity can approve and 
perpetuate” because it “gives to every agent of production the amount of 
wealth which that agent creates.”1 Similarly, some of the analytically 
weaker of the utopian socialists sought in the theory of distribution a 
basis for the condemnation of capitalism. But with these exceptions such 


1 John Bates Clark, The Distribution of Wealth, The Macmillan Company, New 
York, 1899, pp. v, 7-8. 




a quest has elicited the universal derision of modern economists2 and even 
that of Karl Marx.3 

There is good reason to reject marginal productivity analysis as 
evidence of the virtue of the distributive arrangements. Even if Nature 
does contribute a marginal product that is equal to the rent of land, is 
there any inherent virtue in this sum going to the landlord acting as a 
representative of Mother Nature in absentia? 

Let us consider next the macroeconomic models of distribution. Such 
models lump together large numbers of moderately diverse economic 
variables and relationships and treat the resulting aggregates as homogeneous 
economic elements. In this way, manageable models involving 
small numbers of variables and relationships are obtained. Some violence 
is always done to the facts in the process of aggregation. For example, the 
statement that the labor market is in equilibrium when the total effective 
demand for labor equals the total supply can conceal serious difficulties of 
oversupply in some industries and shortages in others. One must therefore 
seek fruitfulness rather than rigor in a macroeconomic model. A completely 
formalistic macro model is likely to be the worst of both worlds because 
it is apt to offer neither empirical insights nor an accurate analytic mech- 
anism. 

At least two such models are available and have been subjects of 
considerable discussion. The first is the classical theory of David Ricardo 
and the second is the model of Nicholas Kaldor, which can be taken as an 
example of the approach associated with Cambridge University.4 Accordingly, 
this chapter will provide a review of these discussions and of some 
of the criticisms to which they have been subjected. 


2 Samuelson, for example, has felicitously described the imputation of virtue to 
payments based on marginal productivity as making “. . . idols of partial derivatives," 
Foundations of Economic Analysis, Harvard University Press, Cambridge, Mass., 1948, 
p. 225. 

3 See, e.g., Das Kapital, Vol. I, Chapter VI, Kerr edition. “The value of labour-power is 
determined, as in the case of every other commodity, by the labour-time necessary for 
the production, and consequently also the reproduction of this special article. . . . It is 
a very cheap sort of sentimentality which declares this method of determining the value 
of labour [in terms of the labor cost of production of subsistence], a method prescribed 
by the very nature of the case, to be a brutal method . . ." (Kerr ed., pp. 189, 192.) 

4 The reader may wonder why there is no discussion of a Marxian distribution 
model. However, it can be argued that Marx never formulated any such model and that 
he deliberately avoided doing so. He considered his analysis to be related primarily to 
production. (Thus, Volume I of Das Kapital, the only one of the three volumes completed 
by Marx, is entitled The Process of Capitalist Production.) Marx certainly held little 
brief for work that separates out distribution as a subject to be studied by itself. See, 
Finally, there is the theoretical micro model of distribution, the marginal 
productivity analysis which is the basis of most theoretical writings on the 
subject. With all of its restrictive assumptions, most notably those of 
universal perfect competition and stationary equilibrium, no one claims 
that it is a very accurate representation of the facts. What is claimed is 
that it describes a consistent mechanism which bears at least some re- 
semblance to the workings of our economic institutions and that embodied 
within its general equilibrium relationships there are the forces which 
determine the payments going to laborers, capitalists, landlords, etc. It 
used to be thought that these complex relationships in fact followed certain 
simple patterns at least roughly and that from these patterns one could 
safely formulate intuitive generalizations and draw conclusions relevant for 
policy. It has been the contention of the Cantab(Cambridge)-Italian 
school that no such generalizations are possible: that any simple conclusions 
drawn from the general equilibrium models will encounter so 
many exceptions of such significance that they become untenable. The 
right question, it would then seem, is not whether a marginal productivity 
analysis (with suitable modifications) is invalid or logically defective. 
Rather the issue is the degree to which it is useful. 


2. On the Marginal Productivity Theory 


The partial equilibrium elements of the marginal productivity theory 
are implicit in our discussions of the theory of the firm and the theory of 
production in earlier chapters. Under pure competition the profit-max- 
imizing firm will hire any input up to the point where its wage equals the 
value of its marginal product, that is, to the marginal physical product of 
the input, multiplied by the money price of the product, for if the marginal 
value of the product exceeds the price of the input, the firm can, by 
definition, increase its profits by acquiring more units of the input since 
additional units bring in more to the firm than they cost. The reverse will 
be true if the price of the input exceeds the value of its marginal product.5 


e.g., Karl Marx, Grundrisse, Martin Nicolaus (ed., trans.), Pelican Books, 
Harmondsworth, 1973, pp. 87, 94-98. 

5 Mathematically, in the single product firm, whose production function is 
$y = f(x_1, \ldots, x_n)$, where $x_i$ is the quantity of its $i$th input, profit is given by 

$$\pi = py - p_1 x_1 - p_2 x_2 - \cdots - p_n x_n,$$

where $p$ is the price of a unit of output and $p_i$ the price of a unit of input $i$. Hence 
profit maximization requires 

$$\partial \pi / \partial x_i = 0 \quad \text{or} \quad p\,\partial y / \partial x_i = p_i,$$

which is the result in the text. 
The reader will recognize this as the usual argument behind any of the 
marginal conditions of equilibrium in any economic problem. 

Where pure competition does not hold and product price varies with 
output (a negatively sloping demand curve for the products of the firm), 
the return to the input will be equated to its marginal revenue product, 
where marginal revenue product is equal to price times marginal physical 
product minus the loss in revenues to the firm that results from the fact 
that increased production forces it to reduce its price on everything it sells.6 
If, for example, a firm has been producing 100,000 units of some product 
which sells at $5 per unit and an additional machine can produce an 
additional 32,000 units of output which, when dumped on the market, 
reduce the price to $4, the net effect is the following: It has added to the 
firm’s revenues 32,000 units at $4 each, but this is partly offset by the $1 
reduction in earnings on each of the remaining 100,000 units sold, making 
a net gain (marginal revenue product) of $128,000 — $100,000 = $28,000. 
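
The arithmetic of this example can be checked in a few lines. The function names below are hypothetical, chosen only for illustration; the figures are those of the text.

```python
def marginal_revenue_product(old_units, old_price, extra_units, new_price):
    """Change in total revenue when the extra output forces the price down on every unit."""
    revenue_before = old_units * old_price
    revenue_after = (old_units + extra_units) * new_price
    return revenue_after - revenue_before


def gain_and_loss(old_units, old_price, extra_units, new_price):
    """Split the change into revenue on the new units and the loss on the old ones."""
    gain_on_new_units = extra_units * new_price
    loss_on_old_units = old_units * (old_price - new_price)
    return gain_on_new_units, loss_on_old_units


if __name__ == "__main__":
    # 100,000 units at $5; the machine adds 32,000 units and the price falls to $4.
    print(gain_and_loss(100_000, 5, 32_000, 4))
    print(marginal_revenue_product(100_000, 5, 32_000, 4))
```

The two routes agree: the revenue on the added units less the price cut on the original units equals the change in total revenue.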

In terms of Figure 1 if DD’ is the firm’s demand curve, suppose an 
additional unit of input increases output from $y_1$ to $y_2$ (marginal physical 
product = $y_2 - y_1$). Then the value of the marginal product is equal to 
price multiplied by $y_2 - y_1$, i.e., to the heavily shaded area in the diagram. 
But this is not unadulterated gain to the firm which finds that simultaneously 
its product price falls by $p_1 - p_2$, causing it a loss on its initial 
output represented by the lightly shaded area. The net revenue contributed 
by an additional unit of output, then, is the difference between the two 
areas, and that is the input’s marginal revenue product. The firm will hire 
the input until its wage is equal to that marginal revenue product.7 

The main point is that the marginal productivity analysis as it has been 
described up to this point has served to help us to determine the firm's 
derived demand for any given input. It shows how the quantity of the input 
demanded by the firm will vary with the input’s price and makes it clear 


6 In terms of the discussion of the preceding footnote, we are no longer assuming
that product price, p, is a constant but that it is a function of output, y. Consequently,
by the chain rule of differentiation (Chapter 4, Section 2, Rule 8), the profit
maximization requirement becomes

$\partial \pi/\partial x_i = 0$ or $p\,\partial y/\partial x_i + (y\,dp/dy)\,\partial y/\partial x_i = p_i$,

thus adding the term $(y\,dp/dy)\,\partial y/\partial x_i$ to the requirement of the
preceding footnote. The value of this term is normally negative since dp/dy is the
reciprocal of the slope of the demand curve. $y\,dp/dy$ is, then, the loss in total
revenue resulting from the reduction in price that accompanies the rise in output.

7 If the price of the input is affected by the quantity hired by the firm, this rule must
again be amended in an obvious manner to state that in equilibrium the marginal cost
of the input to the firm will equal its marginal product. But this and other such
complications are peripheral to our interest here.

Figure 1


that, for a profit-maximizing firm, this demand relationship depends 
directly on the demand for the final product as well as the input’s marginal 
contribution to output. 

However, so far it is not at all a model of input price determination. It 
takes those prices somehow to have been determined outside the model 
and asks how much of each input will be hired in response. Thus, so far, it 
is an analysis of employment by the firm, rather than a model of wages and 
other input prices. 

The determination of the input prices themselves requires one more 
step, but a crucial one. We now have obtained from the marginal produc- 
tivity analysis a demand function for every input by every firm. We can 
take this information, along with the corresponding supply information 
for each input (whether purchased from another firm or a private individual 
like a worker selling his labor power), and embed it, along with the supply- 
demand information for every good in the economy, into a giant Walrasian 
model of general equilibrium. From this machinery we can assume that a 
set of equilibrium prices and quantities will emerge for every item in the 
economy, including wages for different types of labor, rents for different 
qualities of land, etc. This then is the marginal productivity model of price 
determination. 

Unfortunately, no partial model will be able to do its job because if 
there is any place in the economy that interdependence of different 
activities manifests itself, it is in the market for inputs. A rise in wages in 
industry A soon enough affects labor costs in industry B. A rise in the 
price of fuels affects the relative demands for other inputs, depending on 
their comparative economy in fuel use. 


Thus, we are left with the suspicion that the analysis, with all its
assumptions, is fundamentally valid but perhaps not so illuminating as
one might wish. To deal with this difficulty, economists in the neoclassical
tradition have sought to aggregate large sectors of the marginal productivity
model, permitting it to retain its general equilibrium character but
cutting its scope down to two or three homogeneous inputs. In particular,
models have been constructed containing only labor and capital, and 
qualitative conclusions have been derived from them. But this procedure, 
collapsing the vastly heterogeneous set of capital inputs into one artificial 
factor, has seemed particularly objectionable to the Cantab-Italo econo- 
mists. This point will be discussed further in the chapter on capital and 
distribution, where more of the general equilibrium character of the mar- 
ginal productivity theory will emerge. 


3. Marginal Productivity, Cobb-Douglas Functions and the Constancy of Labor's 
Share 


As illustrations of the workings of marginal productivity analysis, we 
offer brief discussions of two subjects that are of interest both as curiosa 
in the history of economic thought and as introductions to the role in
distribution theory of the Cobb-Douglas production function and Euler’s 
theorem on homogeneous functions. 

There is a fair amount of empirical evidence that has been interpreted 
to assert that the share of wages in the national income of the United 
States has remained relatively constant for as long a period as is covered 
by our records. There have been a number of attempts to explain this 
apparently remarkable fact. One proposed explanation of the relatively 
fixed proportion between wage payments and total national income is 
based on the hypothesis that the production function takes the special 
form of a Cobb-Douglas function (see Chapter 11, Section 11) 


$y = kL^{a}C^{1-a}$


where k and a are positive constants (and a < 1). Here y represents total 
national output (income), L is the quantity of labor input, and C is the 
quantity of capital employed. It is easy to show, with the aid of some simple 
differential calculus, that if labor is paid a wage equal to its marginal
product this production function will yield a share of wages relative to
total output which is fixed and independent of the values of the variables,
y, L, and C, as the empirical evidence seems to indicate. In fact, the ratio
between total wage income and total output, y, must in these circumstances 
be exactly equal to a, the exponent of L in the Cobb-Douglas production 
function. A similar result must apply to the income of capital. 


8 For a sophisticated recent discussion, see Melvin W. Reder, “Alternative Theories 
of Labor’s Share," The Allocation of Economic Resources: Essays in Honor of Bernard 
Francis Haley, by Moses Abramovitz and others, Stanford University Press, Stanford, 
Calif., 1959. See also Robert M. Solow, “A Skeptical Note on the Constancy of Relative
Shares,” American Economic Review, Vol. XLVIII, September 1958.

Proof: The marginal product of labor, $\partial y/\partial L$, is found by
differentiation of the production function to be $akL^{a-1}C^{1-a}$. Since
this is the wage per worker, total wage payments must equal this amount
multiplied by the number of workers, L; i.e., the wage bill must be


$L \cdot akL^{a-1}C^{1-a} = akL^{a}C^{1-a} = ay.$


That is, total wages equal a times total output, as was to be proved. 
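
The proof can also be checked numerically. The following is a minimal sketch, assuming the illustrative values k = 1 and a = 0.7 (neither is taken from the text); it evaluates the wage share at several arbitrary combinations of labor and capital.

```python
def cobb_douglas(L, C, k=1.0, a=0.7):
    """Output y = k * L**a * C**(1 - a); k and a are illustrative values."""
    return k * L**a * C ** (1 - a)


def marginal_product_of_labor(L, C, k=1.0, a=0.7):
    """Analytic derivative of y with respect to L."""
    return a * k * L ** (a - 1) * C ** (1 - a)


def wage_share(L, C, k=1.0, a=0.7):
    """Wage bill (L times the marginal product of labor) divided by output."""
    wage_bill = L * marginal_product_of_labor(L, C, k, a)
    return wage_bill / cobb_douglas(L, C, k, a)


# Very different input combinations should all yield a share equal to a.
shares = [wage_share(L, C) for L, C in [(10, 5), (200, 40), (3, 900)]]
print(shares)
```

Whatever the input quantities, the computed share stays at the exponent a (up to rounding), which is all the proof asserts; it says nothing about why a itself should stay fixed.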
Unfortunately, the argument fails in its central purpose. Its objective 
is to explain why the share of wages should have remained constant despite 
vast technological change. But it is just as difficult to see why with such 
technological change the exponent, a, in the production function should 
not have varied, which is the fundamental assumption of the preceding 
discussion. The argument, then, proposes to explain a constant wage share 
with the aid of a constant, a, for whose constancy it offers no explanation! 


It is to be observed also that there is no a priori reason for accepting
the validity of the Cobb-Douglas production function as an accurate
depiction of the technology of the entire economy. It is merely an empirical
hypothesis which has been proposed to explain an empirical observation.9


4. Euler's Theorem and the Adding-Up Controversy 


Our second application of marginal productivity analysis has its roots 
in earlier discussions of distribution theory. When the marginal produc- 
tivity theory first achieved acceptance just before the turn of the century, 
as we have already noted, some economists attempted to use it as a basis 
for showing that the distribution of income under free competitive capital- 
ism must be morally just. In the course of the discussion there arose another 
question. Suppose every productive input is paid the value of its marginal 
product. Does this mean that the entire product will always thereby be 
handed out to those who worked on it, or may something be left over to
fall into the clutches of an exploiter? Indeed, is there always enough on 
deposit in the production bank to pay out all of these marginal product 
claims or might there even be a deficit? It became important to the discus- 
sants to show that the sum of the marginal products added up to exactly 
the total output—that there was neither surplus nor deficit left at the end. 


———— 


9 See Robert M. Solow, “Technical Change and the Aggregate Production Function,”
Review of Economics and Statistics, Vol. XXXIX, August 1957, and Reder, op. cit. For
evidence suggesting that an aggregate production function may not be of Cobb-Douglas
form, see K. J. Arrow, H. B. Chenery, B. S. Minhas, and R. M. Solow, “Capital-Labor
Substitution and Economic Efficiency,” Review of Economics and Statistics, Vol. 33,
August 1961, pp. 225-50.

Here the linearly homogeneous production function again came to the
rescue. As was proved in Chapter 11, Section 10, there is a standard
mathematical result, called Euler's theorem, which tells us that if a
production function involves constant returns to scale, the sum of the
marginal products will actually add up to the total product. That is, if
each input i is paid $p_i = p\,\partial y/\partial x_i$, the value of its
marginal product, we must have


$py = \sum x_i\,p\,\partial y/\partial x_i = \sum p_i x_i.$
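
A numerical illustration of the adding-up property may help. The sketch below uses two hypothetical production functions (not taken from the text), one homogeneous of degree 1 and one of degree 1.5, and approximates the marginal products by central differences.

```python
def numerical_gradient(f, x, h=1e-6):
    """Central-difference approximation to the partial derivatives of f at x."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += h
        down[i] -= h
        grads.append((f(up) - f(down)) / (2 * h))
    return grads


def euler_sum(f, x):
    """Sum over inputs of (quantity times marginal product)."""
    return sum(xi * gi for xi, gi in zip(x, numerical_gradient(f, x)))


def constant_returns(x):
    # Hypothetical function, homogeneous of degree 0.3 + 0.7 = 1.
    return 2.0 * x[0] ** 0.3 * x[1] ** 0.7


def increasing_returns(x):
    # Hypothetical function, homogeneous of degree 0.6 + 0.9 = 1.5.
    return 2.0 * x[0] ** 0.6 * x[1] ** 0.9


x = [4.0, 9.0]
ratio_crs = euler_sum(constant_returns, x) / constant_returns(x)
ratio_irs = euler_sum(increasing_returns, x) / increasing_returns(x)
print(ratio_crs, ratio_irs)
```

Under constant returns the ratio of marginal-product payments to output is approximately 1, so the product is exactly exhausted; for the degree-1.5 function the ratio is approximately 1.5, so paying every input its marginal product would require more than the total output.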


Wicksteed's10 injection of homogeneous linear production functions
into the discussion opened a long controversy over the plausibility of the
hypothesis that the production function will take this form in practice.11

As for the controversy itself, as Samuelson pointed out,12 the discussion
really seems to have missed the point. Whether there are any profits
of exploitation left over for the capitalist to haul in is really a matter of
market conditions. For example, as we have seen, in the long run under
perfect competition prices of outputs and inputs will settle toward levels 
at which there is nothing left over for payment to the entrepreneur in 
excess of his managerial wages and interest on his capital, but under 
monopoly there will normally be profits in excess of this amount. The older 
Euler's theorem discussion abstracted entirely from the product and input 
markets in which competitive pressures, if anything, will rob the exploiter 
of the fruits of his exploitation. 

But how does this conclusion square with Euler's theorem which says 
that marginal products add up to the total product when and only when 
the production function is linearly homogeneous?13 The answer is implicit
in a solution that was first proposed by Walras and then rediscovered by 
Hicks. They showed that, whether or not the production function is 
linearly homogeneous, in the vicinity of a competitive equilibrium point it 
must be locally linearly homogeneous, that is, all of its variable values and 


10 See Philip Wicksteed, The Coordination of the Laws of Distribution, Macmillan & 
Co., Ltd., London, 1894. Actually, it was A. W. Flux who, in his review of Wicksteed 
(Economic Journal, Vol. IV, 1894), explicitly injected Euler’s theorem into the discussion. 

11 Edgeworth’s much quoted comment merits repetition here: “There is a mag- 
nificence in this generalization which recalls the youth of philosophy. Justice is a perfect 
cube, said the ancient sage; and rational conduct is a homogeneous function, adds the 
modern savant." Collected Papers Relating to Political Economy, Macmillan & Co., Ltd., 
London, 1925, Vol. I, p. 31 (the remark was first published in 1904). 

12 Foundations of Economic Analysis, Harvard University Press, Cambridge, Mass.,
1948, pp. 83-87. 

13 If the production function is homogeneous of some degree $r \neq 1$ (nonconstant
returns to scale), Euler's theorem (Chapter 11, Section 10) tells us that
$rpy = \sum x_i\,p\,\partial y/\partial x_i = \sum x_i p_i$ so that $\sum x_i p_i$,
total payment to the firm's inputs, will not equal py, the value of its total output.
One can prove a comparable result for a production function that is not homogeneous
of any degree.

derivatives must be the same as those of a linearly homogeneous function.
Thus, at that point all of the marginal products (the partial derivatives 
dy/dx;) must coincide with those of a linearly homogeneous function, and 
so they too must satisfy the Euler’s theorem condition. 

To show why the production function must be locally linearly homo- 
geneous in competitive equilibrium we note first that the simple function 


(1)   $c = p_1 x_1 + p_2 x_2 + \cdots + p_n x_n$


must be linearly homogeneous in the input quantities $x_1, \ldots, x_n$ since if
each $x_i$ is multiplied by k then c will obviously also be multiplied by k
and that is just what we mean by linear homogeneity.

Now Equation (1) is simply the total cost to the firm of the collection
of inputs $x_1, \ldots, x_n$, and its graph is a (hyper)plane through the origin,
CacsceCa in the two-input case represented in Figure 2. The oddly shaped
surface (shaded area) in the diagram represents the production function
(or rather the value of output) $py = pf(x_1, x_2)$. If the second-order
conditions hold at the point of equilibrium, T, the two surfaces must be tangent
there, since the zero-profit requirement assures us that at no combination 
of inputs and outputs will the value of outputs exceed the cost of the 
corresponding inputs, and at the equilibrium point the two will be equal. 
This gives us our result, for the tangency of the two surfaces at T means 
that there they will both have the same derivatives. In other words, 
$pf(x_1, x_2)$ must, indeed, be linearly homogeneous locally at T, as was to be
shown. Euler’s theorem must therefore apply, and the payment to each 
input of its marginal product must exhaust total product. 


Figure 2 (vertical axis: value of output)


5. Alternative Distribution Theories, I: The Ricardian Model


Having discussed the structure and some applications of marginal 
distribution theory, we may note once again that in terms of logical con- 
sistency and formal completeness it has no equal. However, it has told us 
very little about the burning issues to which one might hope distribution 
theory will address itself, issues such as the magnitude of the income gap 
between the poor and the wealthy and its relationship to their role in the 
productive mechanism. For contrast, we therefore summarize briefly two 
macro models which are intended to come closer to dealing with such 
topics—Ricardo’s analysis and the more recent Kaldor model. 

Ricardo’s distribution theory is still perhaps the one model which 
assigns distinctive roles to different economic classes, which deduces well- 
defined trends in their earnings, and which offers clear-cut policy implica- 
tions. While few economists would concede that it constitutes a good 
description of a complex industrial economy, it has been held that its 
implications still retain some relevance at least for some less developed 
economies. 

Ricardo’s analysis rests on four central components: diminishing 
returns to labor utilizing a fixed quantity of land, the theory of rent, the 
tendency of universal competition to equalize returns to investment, and 
the Malthusian population principle. Though the roles of these four 
elements are interdependent, we must consider them individually. 

The principle of diminishing returns has already been examined in the 
chapter on production. In sum, it asserts that as we increase the use of 
some inputs holding the quantities of others constant, the average and 
marginal yield of the expanding inputs must eventually fall. In classical 
theory this was applied to the increased use of labor (with growing popula- 
tion) and of capital (the product of capitalists’ accumulation) on society’s 
fixed supply of land, “the original and indestructible powers of the soil."14
This can take the form of a decreasing yield either to investment of labor 
and capital on a given piece of land (their intensive yield) or to investment 
on successive and successively inferior pieces of soil (their extensive yield). 
In either case, the process of competition and the mobility of capital will 
in the long run tolerate no difference of return to different investments. 
The payment to landlords on higher-yielding units of investment will be 
bid up by competition to the point where all units provide the same net 
return to the investor. Thus the landlord will receive as rent the difference 
between the investor’s return on his least productive investment (i.e., his 
return on the margin) and the higher return on all other units of investment.
If three units of output have, respectively, marginal costs of $3, $7,
and $9, then, as illustrated in Figure 5a, the first will earn a rent of $6,
the second a rent of $2, so that all three will end up with a unit cost,
including rent, of exactly $9. 15


14 David Ricardo, On the Principles of Political Economy and Taxation, Sraffa ed.,
Vol. I, Cambridge University Press, New York, 1951, pp. 67, 71-72.
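
The rent calculation in the three-unit example can be sketched in a few lines of Python. The helper name is hypothetical, and the sketch assumes, as the argument above does, that competition bids rent on each unit up to the gap between the marginal cost of the least productive unit and that unit's own marginal cost.

```python
def ricardian_rents(marginal_costs):
    """Rent on each unit: cost at the margin minus the unit's own marginal cost."""
    cost_at_margin = max(marginal_costs)
    return [cost_at_margin - mc for mc in marginal_costs]


costs = [3, 7, 9]  # marginal costs of the three units in the text
rents = ricardian_rents(costs)
unit_costs_with_rent = [mc + r for mc, r in zip(costs, rents)]

print(rents)                 # [6, 2, 0]
print(unit_costs_with_rent)  # [9, 9, 9]
```

The marginal unit earns no rent, and every unit ends up with the same cost including rent, namely the marginal cost of the least productive unit.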

This already describes the mechanism for the determination of one 
distributive share—that going to landlords. In the Ricardian model profits 
are determined residually as what is left over from production after pay- 
ment of rent and wages. Consequently, we only have yet to discuss the 
wage determination process. For this we still lack one component of the 
analysis, the Malthusian population theory, which can be taken to assert, 
in oversimplified form, that whenever wages exceed some amount called 
“the subsistence level" people will marry earlier, and consequently re- 
produce more rapidly, thus eventually tending to increase the labor force 
to a point where diminishing returns and the competition of workers 
combine forces to drive wages back to the subsistence level. 


All the pieces of the dynamic mechanism of the Ricardian distribution 
model are now in place, and its workings can be described very briefly. 
Capitalists, stimulated by high profits, are induced to save substantial 
portions of their earnings, which they invest as a means to earn still more 
profits. The increased investment demand brings with it a rise in demand 
for labor. That in turn raises wages and induces growth in population. In 
this process profits are eroded, and consequently investment is impeded by 
two forces—the rise in wage rates induced by the investment itself and 
diminishing returns, which are brought into play by increased population
and expanded economic activity. The wage rise may only be temporary
since the expanded population and its increased labor force ultimately
depress wages once more. However, diminishing returns do not disappear
by themselves. Unless there is technological progress on a scale sufficient
to offset them, profits will suffer a persistent and cumulative erosion. This
is the classical variant of the famous law of the falling rate of profit.

This process is illustrated in Figure 3, which shows the behavior of
population, wages, rents, and output with the passage of time, on the 
assumption that there is a constant ratio between the size of the population 
and the size of the labor force. The curve OY shows how total output 
varies with the size of the labor force, with the curve leveling off toward the 
right as a result of diminishing returns to additional labor. 




15 For details of the argument see Section 10 of this chapter. It is ironic that this
Ricardian rent theory was neither discovered by Ricardo nor claimed by him. Ricardo
attributes the idea to Malthus but it can also be ascribed to Edward West, and perhaps
to James Anderson some forty years earlier. See Sraffa’s Ricardo, Vol. IV, pp. 3-8, and
Joseph Schumpeter, History of Economic Analysis, Oxford University Press, Inc., New
York, 1954, pp. 263-65.

Figure 3


Total rent payments increase steadily with population growth and the 
resultant increase in land use. Therefore, curve Y - R, that is, total
output minus rent, also levels off as we move toward the right, i.e., as
population grows. Y - R gives the portion of output available for division
between wages and profits. Finally, the line OS shows how much output is 
required to pay every member of the population a fixed subsistence wage 
s. Since the expression for this curve is S = sN, where N is population 
size, this is a straight line through the origin. 

In exaggerated form we may then take the dynamics of the classical
distribution process to proceed somewhat as follows: Assume that population
is initially $N_0$ and that the rate of accumulation is initially high so
that the level of wages is bid up to a point where it absorbs most of
non-rent output (point $W_0$), and is well above the subsistence level,
$N_0S_0$. This will encourage population to grow to $N_1$ at which the wage
payment covers no more than subsistence, $N_1S_1$. At this point profits
will be high ($S_1W_1$), inducing increased accumulation and pushing total
wages upward once more, this time toward $W_1$. The process repeats itself,
moving roughly in the sequence of steps $W_0S_1W_1S_2W_2 \cdots$ toward
point T, the point where output after rent is just equal to the requirements
of subsistence. As population approaches N, the level corresponding to
point T, the economy approaches its stationary state; at this, in the
absence of new technology or other exogenous events, profits, accumulation,
and population growth remain forever at zero, wage payments remain forever
at subsistence, and rents remain at their maximum attainable level, TR.
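
To make the stationary state concrete, here is a rough Python sketch that locates the population at which output after rent just covers subsistence. Everything numerical in it is an assumption made for illustration: the output curve Y = A * N**alpha, the values of A, alpha, and s, and the rule that rent absorbs the surplus over labor's marginal product, so that Y - R = N * dY/dN.

```python
A, ALPHA = 100.0, 0.5   # hypothetical technology parameters
SUBSISTENCE = 2.0       # hypothetical subsistence wage s


def output(n):
    """Total output curve OY, with diminishing returns to labor."""
    return A * n**ALPHA


def output_after_rent(n):
    # Assumption: rent is the surplus over the marginal product of labor,
    # so Y - R = N * dY/dN = ALPHA * A * N**ALPHA.
    return ALPHA * A * n**ALPHA


def subsistence_bill(n):
    """The line OS: S = s * N."""
    return SUBSISTENCE * n


def stationary_population(lo=1.0, hi=1e6, tol=1e-9):
    """Bisection for the point T where Y - R crosses the subsistence line."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if output_after_rent(mid) > subsistence_bill(mid):
            lo = mid   # something is still left over for profits: T lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


n_T = stationary_population()
rent_T = output(n_T) - output_after_rent(n_T)
profit_T = output_after_rent(n_T) - subsistence_bill(n_T)
print(n_T, rent_T, profit_T)
```

With these particular numbers the crossing can also be found by hand: 50 * N**0.5 = 2N gives N = 625, at which rent is 1250 and profits are zero. Different functional forms would move the point T but not the logic of the crossing.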

Thus, in this process workers may gain little, and capitalists almost 
certainly lose out. The sequence of events also includes a diminishing 
propensity to accumulate as the rewards of accumulation decline, with the 
concomitant loss in the forces making for (at least temporary) wage
increases. The landlords are the only ones who benefit from this dynamic
process, gathering ever-increasing rents as the demand for their land
grows with rising population, and inferior lands are brought into
production, thus raising the rents on the better pieces of land. In this
process, then, the interests of landlords are diametrically opposed to those of every
other group. Workers and capitalists are not natural allies for they must 
compete for the remainder of the output of the economy, the portion not 
claimed by landlords. Yet workers and capitalists do have a common in- 
terest in the success of the accumulation process, as we will note again 
presently. 

Two qualifications relating to the wage determination process are 
crucial for the policy implications of the Ricardian model: First, “sub- 
sistence" must not be taken to be some absolute bundle of consumption 
goods which constitutes the border line of starvation. On the contrary, 
the “subsistence” level depends on accepted standards of living: 


“It is not to be understood that the natural price of labour, estimated 
even in food and necessaries, is absolutely fixed and constant. It varies at 
different times in the same country, and very materially differs in different 
countries. It essentially depends on the habits and customs of the people
. . . many of the conveniences now enjoyed in an English cottage, would
have been thought luxuries at an earlier period of our history" (Principles, 
Sraffa ed., p. 97). 


Second, when the demand for labor is sufficiently high, wages can be 
kept for indefinite periods of time above even this flexible “subsistence”
level: 


"Notwithstanding the tendency of wages to conform to their natural 
rate, their market rate may, in an improving society, for an indefinite 
period, be constantly above it; for no sooner may the impulse, which an 
increased capital gives to a new demand for labour, be obeyed, than
another increase of capital may produce the same effect . . .” (Principles,
p. 95).


There is, then, no “iron law of wages.”16 It is important to understand
that these two features of the wage determination process are no minor 
qualifications to his theory. Rather they are critical elements in its policy 
implications, for to Ricardo several conclusions followed immediately:


(a) The best way to contribute to the welfare of labor is to increase
the standard of living, which serves as the acceptable minimum “sub- 
sistence level" below which the population will fail to reproduce itself. In 


16 The notion of an iron law is neither Ricardian nor Marxian. Indeed, Marx
emphasized the points which have just been made. See, e.g., Das Kapital, Vol. I,
p. 190, and Chapter XXV, Section 1. The notion of an iron law of wages is associated
with Marx’ nemesis, Ferdinand LaSalle. For Marx’ scornful views on this, see his
Critique of the Gotha Program, Section II.

other words, in the long run the most effective way to raise real wages is to
decrease population growth, and a good way to achieve this is through a 
rise in the living standards considered necessary before embarking on the 
raising of a family; 

(b) A second best procedure for raising workers’ incomes is anything 
that encourages the continued accumulation of capital, which raises the 
demand for labor and hence bids up wages; 

(c) Free trade is desirable because it reduces the cost of the ‘‘sub- 
sistence" of workers and decreases rents by reducing the demand for land 
within the country. Together these two influences augment the return to 
capital, stimulate accumulation, and raise wages; 

(d) Payments to the poor must in the long run depress average real 
wages because they induce population to expand and thereby benefit no 
one but the landlord. They must hamper accumulation and decrease the 
marginal product of labor. Thus, though they appear to serve the interests 
of the poor, in the long run they work to their disadvantage. 

Today the Ricardian model is generally considered a gross over- 
simplification. Population growth is not determined so mechanistically. 
The evidence on the proposition that payments to low-income groups
stimulate their reproduction is hardly clear-cut, and it seems often to work
the other way. Technological change permits real wages to rise even as 
population grows. Wage bargaining is a complex process affected by 
unionization and other institutions. Monopolistic elements and govern- 
ment agencies play more of a role than they are assigned in the classical 
analysis. Yet in spite of all this one comes away from the analysis with 
more than a little admiration for its comprehensiveness, its logical strength 
and its ability to deal with the workings of a complex set of phenomena 
with the aid of a simple and suggestive structure. Certainly Ricardo 
provided us with a macro model whose construction must be considered a 
major intellectual feat. 


6. Alternative Distribution Theories, II: The Kaldor Model


The Cantab-Italo approach to distribution derives its roots both from 
Ricardian analysis and from Keynesian theory. A noteworthy example of 
its analysis is the Kaldor macroeconomic model, whose primary aim is 
to analyze the share of wages in the total national product. 

The basic premise of the model is that workers and capitalists save 
different proportions of their incomes. Consequently, given the level of 
(full employment) investment and total income there will be only one 


17 See Nicholas Kaldor, “Alternative Theories of Distribution,” Review of Economic
Studies, Vol. XXIII, No. 2, 1955-56.

proportion between workers’ and capitalists’ shares of that total income at
which total saving will equal total investment, i.e., at which the total 
demand for output will equal its total supply. 

The level of employment is, as usual in Keynesian analysis, taken to 
be a function of national output, Y. This level of employment, f(Y), 
multiplied by the wage rate, W, is the total payment of wages, $W \cdot f(Y)$.
The rest of output, $Y - W \cdot f(Y)$, is then the income which accrues to
other classes of income earners. 

Assume now that workers save a smaller proportion of their incomes 
than do other economic groups—say that workers save the proportion k 
of their incomes, whereas the corresponding figure for the rest of society
is k*, where, by assumption, k < k*. Total desired saving will, therefore,
be equal to that of the workers, kWf(Y), plus that of the nonworkers,
k*[Y - Wf(Y)]. If I is a given level of investment, equilibrium is
determined by the condition that desired saving be equal to the level of
investment, i.e., that


(2)   $kWf(Y) + k^*[Y - Wf(Y)] = I,$


where I, k, and k* are assumed to be known constants. If for Y we
substitute the full employment level of output, $Y_f$, this becomes a single
equation with one unknown, W, which can be solved for the equilibrium
level of wages, $W_e$.
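
As a rough illustration, equation (2) can be solved for the wage in a few lines of Python. The numbers below (investment, full employment output and employment, and the two saving propensities) are hypothetical values chosen only to make the algebra concrete.

```python
def equilibrium_wage(I, Y_f, employment, k, k_star):
    """Solve k*W*f + k_star*(Y - W*f) = I for W, where f is employment at Y_f."""
    # Rearranging (2): (k - k_star) * W * f = I - k_star * Y
    return (I - k_star * Y_f) / ((k - k_star) * employment)


def desired_saving(W, Y, employment, k, k_star):
    """Left-hand side of (2): workers' saving plus nonworkers' saving."""
    wages = W * employment
    return k * wages + k_star * (Y - wages)


# Hypothetical figures, not taken from the text.
I, Y_f, N_f = 200.0, 1000.0, 100.0
k, k_star = 0.05, 0.40

W_e = equilibrium_wage(I, Y_f, N_f, k, k_star)
wage_share = W_e * N_f / Y_f
print(W_e, wage_share, desired_saving(W_e, Y_f, N_f, k, k_star))
```

Substituting the solved wage back into the saving function reproduces the given level of investment, which is the equilibrium condition the model imposes.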

The analysis suggests an interesting policy conclusion. Suppose that at 
some other wage rate equilibrium national income is below the full employ- 
ment level and that the employment function, f(Y), is independent of the 
level of wages. Then a rise in wage level will not depress the demand for 
labor. On the contrary, it will transfer income from a group of low spenders 
to a group of high spenders so that total effective demand and hence 
employment and the level of national income must all rise!18 The moral is,


18 Proof: Since consumption equals income minus saving, a rise in wage level by
amount $\Delta W$ will raise workers’ expenditure to $(1 - k)(W + \Delta W)f(Y)$, and
nonworkers’ consumption expenditure will change to

$(1 - k^*)[Y - (W + \Delta W)f(Y)]$

so that total consumption demand will have changed from

$(1 - k)Wf(Y) + (1 - k^*)[Y - Wf(Y)]$

to

$(1 - k)(W + \Delta W)f(Y) + (1 - k^*)[Y - (W + \Delta W)f(Y)].$

By subtraction, we see that demand will have changed by $(k^* - k)\,\Delta W f(Y)$,
that is, effective demand must have risen, since k < k*.

apparently, that during a period of depression a wage rise is likely to be a
good thing and may produce at least part of the income necessary to pay 
for it. Boulding has called such a construct a “widow’s cruse" model, 
after the legend of the widow who found that emptying her pitcher only 
filled it up again. Here the payment of higher wages out of national income 
helps to produce the wherewithal to pay them by increasing demand and 
therefore improving business receipts. 
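
The effect of a wage rise on consumption demand can be traced with a short Python sketch. The figures used are hypothetical, and the sketch holds output and employment fixed while the wage changes, as the argument assumes.

```python
def consumption_demand(W, Y, employment, k, k_star):
    """Workers' plus nonworkers' consumption, each equal to income minus saving."""
    wages = W * employment
    return (1 - k) * wages + (1 - k_star) * (Y - wages)


# Hypothetical figures, not taken from the text.
W, dW = 5.0, 0.5
Y, employment = 800.0, 100.0
k, k_star = 0.05, 0.40

change = (consumption_demand(W + dW, Y, employment, k, k_star)
          - consumption_demand(W, Y, employment, k, k_star))
predicted = (k_star - k) * dW * employment  # the expression (k* - k) dW f(Y)
print(change, predicted)
```

The computed change in demand agrees with the expression (k* - k) times the wage increase times employment, and it is positive whenever workers save a smaller fraction of income than nonworkers.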

Another curious implication of the model is that in Kaldor's world
capitalists can always increase their share of income by increasing their
spending, i.e., by reducing their savings rate, k*, to any point short of the
level where it reaches that of the workers, k. For suppose total desired
saving was previously equal to investment, so that after the decline in k*
desired saving is less than investment. Equilibrium can then be restored
only by a transfer of income from workers to capitalists, who are the
higher savers. If k* is very close to k a given
transfer of income from workers to capitalists will add very little to total
saving, and so it will require a large transfer to the capitalists to produce
the equilibrium in which desired saving matches investment.¹⁹ Thus,
capitalists will find that the more they spend the more they have, so that
the capitalists then have access to a widow’s cruse of their own.
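
A brief sketch may make the capitalists' cruse concrete. It solves the saving-investment condition, k times the wage bill plus k* times profits equals investment, for total profits. The function name and all of the numbers are hypothetical; the only assumptions carried over from the text are that k* exceeds k and that investment exceeds k times income.

```python
def profits(Y, I, k, k_star):
    """Profits implied by k*(wage bill) + k_star*(profits) = I."""
    if k_star <= k:
        raise ValueError("the model assumes k_star > k")
    return (I - k * Y) / (k_star - k)


# Hypothetical figures: income 1000, investment 150, workers save 5 percent.
Y, I, k = 1000.0, 150.0, 0.05
for k_star in (0.40, 0.30, 0.20):
    r = profits(Y, I, k, k_star)
    print(f"k* = {k_star:.2f}: profits = {r:7.1f}, share of income = {r / Y:.1%}")
```

As the capitalists' savings rate is lowered toward that of the workers, the computed profit share rises, which is the result the paragraph describes.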

The model has been the subject of considerable criticism. For one thing,
it is never made clear why the economy in this model has an automatic
tendency to approach the level of full employment. For another, the
premise that employment depends only on output and not on wage level 
denies that higher wages will induce the adoption of labor-saving inven- 
tions. Tobin has suggested that the capitalists’ widow's cruse is in fact a 
demonstration of the weakness of the theory. In a limiting case it permits 
them to capture all of GNP by spending enough of it for themselves! 

It really is not easy to believe that this simple model encompasses all 
there is to the determination of labor's share or even most of the primary 
influences in this process. Yet it does offer some suggestive ideas, and it 
certainly represents a fascinating attempt to produce an analysis which 
lends itself more readily to interpretation in terms of policy than the 
general-equilibrium analysis with all of its complexities. 


7. Backward-Rising Input Supply Curves: Labor and Saving 


Having discussed some of the main theoretical approaches to the 
theory of distribution, we turn now to some special topics, most of them
relating to particular categories of input. 


19 Proof: By (2), $(k - k^*)Wf(Y) + k^*Y = I$, so that total wage earnings equal
$Wf(Y) = (I - k^*Y)/(k - k^*)$ and total profits, $r = Y - Wf(Y) = [Y(k - k^*) -
I + k^*Y]/(k - k^*) = (kY - I)/(k - k^*)$. Since $k^* > k$ by assumption, this will be
positive if $I > kY$. Then as $k^*$ moves toward $k$ so that the absolute value of the denominator
falls, total profits must rise.




In Section 2 we saw how the marginal productivity model provides us 
with derived demand functions for inputs. It is appropriate now to offer 
some remarks on the supply relationships. Because these are so dependent 
on particular institutional arrangements in the markets for the different 
inputs, no general pronouncements on this subject are possible, but a 
number of observations about particular types of input can be fruitful. 

Most inputs are supplied by business firms. That is obviously the case 
with coal, iron, oil, lumber, and many other items. Given their demand, 
the analysis of the supplies of these items is therefore identical with that 
of the determination of any output level in the theory of the firm. The 
discussion of Chapters 11-16 applies here without change. 

However, a number of important inputs are supplied by private in- 
dividuals rather than by business firms. The worker who supplies labor 
time, the saver who supplies funds for investment, and even the small 
farmer may be considered to fall into this category. Each of these groups 
supplies items which they can also use for themselves. The worker can use 
in leisure pursuits the portion of his time which he does not sell, the investor 
can conserve the money which he does not lend out, and the farmer can use 
for himself at least some of the products which he does not sell. We say 
that each of these sellers has a reservation demand for his product—he 
wants to reserve some for himself. 

The amounts of such inputs which will be supplied then depend both 
on the quantities which are produced and the amounts which the sellers 
choose to demand for themselves. The theory of demand of Chapter 9 
therefore becomes highly relevant for the analysis of these input supplies. 

What will happen to the supply of such an item when its price rises? 
Usually we expect that a rise in price will increase the supply of a good, 
but we shall see now that in the reservation-demand situation this will not 
always be true—a rise in price may well cause a reduction in supply. 

To see how this works out, let us consider the supply of labor time pro- 
vided by one worker. He has twenty-four hours to divide between work and 
leisure. His desired labor supply, then, is simply what is left over from the 
twenty-four hours after his reservation demand for leisure time. Suppose 
there is a rise in the hourly wage rate—the price of his labor time. This 
means that the price per unit (per hour) of leisure time has risen. The effect 
on his supply of labor time can then be determined residually by deter- 
mining the effect of this price rise on his demand for leisure. 

As in the ordinary theory of demand, we can divide the effect of the 
price rise on his demand for leisure into the substitution effect (the effect 
of the relative rise in the cost of leisure compared to that of his other pur- 
chases) and the income effect (the effect of the change in his real purchasing 
power which results from this rise in price). As in the ordinary theory of 
the consumer, the substitution effect of a rise in the price of leisure will 
make him want to purchase less. Other consumer goods will have become
relatively cheaper, hence there will be more attractive ways to spend his
money. Thus the substitution effect of a rise in wages will, indeed, tend to
raise the labor supply since it will work to reduce the amount of time which
he wishes to keep for himself.

But the income effect will work out quite differently from the way it
does with an ordinary consumer product. First, the income effect is now
virtually certain to be much stronger. The consumer usually spends only a
small proportion of his income on any one product, so that a rise in its
price alone will have very little effect on his real income. But a worker's 
income is largely or even entirely dependent on the sale of his labor time. 
Hence, a rise in hourly wages (the price of leisure) will have a substantial 
effect on his income and therefore, in turn, on his purchases. The income 
effect of a rise in the price of leisure will, therefore, be far more important 
than that of a rise in the price of shoes. 

A second difference between the reservation-demand and the ordinary 
consumer-demand cases is that the income effects in the two situations will 
ordinarily be of opposite direction. A rise in the price of something he buys 
reduces the consumer's real purchasing power and therefore tends to reduce 
the demand for the item—it works in the same direction as the substitution 
effect. But a rise in the price of something he sells—such as labor power— 
makes the seller richer and permits him to afford more of the good things 
in life—leisure among them. Thus the income effect of a rise in wages—the 
price of leisure—is likely to be an increased demand for leisure. The (very 
likely substantial) income effect usually works in the reverse direction from 
the substitution effect. The net result may well be that a rise in the price 
of leisure increases its reservation demand; that is, a rise in wages may 
reduce the supply of labor. In this way we may have a negatively sloping 
(so-called backward-rising) supply curve of labor. 
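
The tug of war between the two effects can be illustrated with a small sketch. It assumes, purely for illustration, that the worker has CES preferences over consumption and leisure with elasticity of substitution sigma; neither this utility function nor the parameter names appear in the text. When sigma is below one the income effect dominates and hours worked fall as the wage rises; when sigma is above one the substitution effect dominates and hours rise.

```python
def hours_worked(wage, sigma, a=1.0, total_hours=24.0):
    """Utility-maximizing hours of work under assumed CES preferences.

    The worker equates the marginal rate of substitution, a*(c/l)**(1/sigma),
    to the wage, where consumption is c = wage*(total_hours - l) and l is
    leisure. Solving gives the closed form used below.
    """
    leisure = total_hours / (1.0 + wage ** (sigma - 1.0) * a ** (-sigma))
    return total_hours - leisure


for sigma in (0.5, 2.0):
    schedule = [round(hours_worked(w, sigma), 2) for w in (1.0, 2.0, 4.0)]
    print(f"sigma = {sigma}: hours supplied at wages 1, 2, 4 -> {schedule}")
```

The first schedule traces out a negatively sloping supply curve of labor, the second an ordinary positively sloping one; which case holds is an empirical matter that the sketch does not settle.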

Of course, the individual worker does not usually have the option of 
reducing his working hours. If a factory is geared to a forty-hour week, it 
cannot very well suit the different preferences of individual employees by 
hiring some people for forty-seven hours and others for twenty-eight hours. 
But a negatively sloping supply curve of labor has nevertheless played a 
persistent role in the history of labor—via union demands for shorter hours, 
which accompanied rising hourly wages. The shorter work week has oc- 
curred with the consent—indeed, as a result of the demands—of an 
increasingly prosperous labor force. 

For similar reasons, the possibility of negatively sloping supply curves 
arises also in the case of savings. It has often been assumed that a rise in 
interest rates—the price of savings which are loaned out—will lead people 
to save more. But a rise in interest rates also increases the income of the
lender, and he may consequently prefer to increase the proportion of his
income which he spends on himself. As in the case of wages, and for exactly
the same reasons, the income effect of a rise in interest rates is likely to be
substantial and in the opposite direction from the substitution effect—it 
will tend to make for reduced savings. The net result of the income and 
substitution effects is in this case in considerable doubt. Some have con- 
cluded that, for the community as a whole, the supply of savings which 
are available for lending is on balance relatively interest inelastic—a 
change in interest rate will make little difference to supply because the 
income and substitution effects will tend to cancel out. In individual cases, 
however, this will not always be so. Cassel and Keynes have described one 
extreme case in which the savings supply curve is likely to have a pronounced
negative slope.²⁰ Suppose a man is saving money and lending it
out at interest with the objective of having enough to buy a boat when he 
retires in five years. If the price of the boat does not change, the higher 
the rate of interest, the less he will have to put away in order to achieve 
his objective. A rise in interest rate will therefore clearly decrease his 
motivation for saving because it increases his income from his lendings. 
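
The case of the saver with a fixed target can be worked through numerically. The sketch below assumes that he makes one equal deposit at the end of each of the five years and that interest is compounded annually; the boat price of $10,000 is a hypothetical figure.

```python
def required_annual_saving(target, rate, years):
    """Equal end-of-year deposit that accumulates to `target` after `years`."""
    if rate == 0:
        return target / years
    return target * rate / ((1 + rate) ** years - 1)


boat_price = 10_000.0  # hypothetical target
for rate in (0.0, 0.02, 0.05, 0.10):
    deposit = required_annual_saving(boat_price, rate, 5)
    print(f"interest rate {rate:4.0%}: annual saving needed = {deposit:8.2f}")
```

The required deposit falls steadily as the interest rate rises, so that for this saver the supply of savings slopes negatively.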


8. Unions as Monopolies: Alternative Union Goals 


It is a standard observation that unionization has made the analysis of 
wage determination a matter for the theory of monopoly rather than that 
of competition. Indeed, the wage-bargaining process is, in some discussions, 
taken to be a case of pure bilateral monopoly, with negotiation and the 
decision-making entirely in the hands of a set of union representatives on
the one side and a monolithic industry (management) group on the other. 
The theory of bilateral monopoly with its superim 
curves (as described in Chapter 16 
applied with little or no modification to the a 


supply, and that is the only price which is applicab] 
range of supply prices to be recorded in a supply c 


20 See J. M. Keynes, The General Theory of Employment, Interest and Money,
Harcourt Brace Jovanovich, Inc., New York, 1936, pp. 94 and 182.


maximizing their profits, others their market share; still others may pursue
hybrid objectives, and all of these are usually only vaguely and only 
implicitly defined. 

The plausible objectives of trade unions are perhaps even more diverse 
than those of business firms. Just to suggest the nature of some of the 
possibilities and to indicate some of their implications, let us consider the 
following three alternatives: 


1. That the union wishes to keep all of its members employed; 

2. That the union wishes to maximize the total income of its 
members; 

3. That the union wishes to maximize hourly wages and keep at 
least a steady core group of its members employed. 


Figure 4 shows some of the implications of these three objectives and
demonstrates that they are likely to be incompatible.²¹ Here DD’ represents
the industry demand curve for labor.²² If the total membership of the
union is OE₁, the union which seeks to get jobs for all of its members will
have to settle for wage OW₁ per worker because any higher wage will cut
the demand for labor to below OE₁. On the other hand, the union which
wishes to maximize the total wage earnings of its members should demand
wage OW₂, which corresponds to the point of unit elasticity, U, on the
demand curve (for it will be remembered that where elasticity is greater
than unity a fall in labor price will increase total industry expenditure on
labor, whereas where elasticity is less than unity it is a rise in price that will
increase it). The total-wage-income-maximizing level of employment,
OE₂, can also be identified by the condition that at OE₂ the marginal
revenue curve, DE₂ (corresponding to the demand = average revenue
curve DD’), must cut the horizontal axis, i.e., the additional wage payment
resulting from an increase in employment must be zero.

Finally, if the union wishes to maximize wages for a small core group
of its members, OE₃, the corresponding maximum wage per worker is OW₃.
Thus, depending on which objective it adopts, the union will find different
policies appropriate, and there will be no one decision which effectively
pursues all three objectives simultaneously.
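
The incompatibility of the three objectives can be seen in a small sketch. It assumes a straight-line demand curve for labor, W = a − bE, which is not specified in the text, and it measures employment as a positive number rather than following the figure's sign convention. The function name and all figures are hypothetical.

```python
def union_wage_targets(a, b, members, core):
    """Wages implied by the three objectives on the assumed demand W = a - b*E."""
    income_max_employment = a / (2 * b)  # where marginal revenue a - 2*b*E is zero
    if not core < income_max_employment < members:
        raise ValueError("sketch assumes core < a/(2b) < members")
    full_employment_wage = a - b * members           # objective 1
    income_max_wage = a - b * income_max_employment  # objective 2
    core_wage = a - b * core                         # objective 3
    return full_employment_wage, income_max_wage, core_wage


# Hypothetical figures: 80 members, a core group of 20.
a, b, members, core = 100.0, 1.0, 80.0, 20.0
w1, w2, w3 = union_wage_targets(a, b, members, core)
for label, wage in (("all members employed", w1),
                    ("maximum total wage income", w2),
                    ("core group only", w3)):
    employment = (a - wage) / b
    print(f"{label:26s}: wage = {wage:5.1f}, employment = {employment:5.1f}, "
          f"wage bill = {wage * employment:7.1f}")
```

On these figures each objective calls for a different wage, and the wage bill is largest at the middle wage, where the demand curve has unit elasticity.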

A union which maximizes the total wage receipts of its members 
(objective 2) must be prepared to accept the unemployment of what may 


21 The reader will recall the convention that inputs are measured as negative quan- 
tities. That is why the horizontal axis is taken to go to the left of the origin with higher 
negative values representing higher employment levels. 

22 The use of a demand curve implies that the industry is not a monopsonist—a single 
unified buyer of labor for whom there is no relevant input demand curve—for the same 
reason that the monopolist has no supply curve in the ordinary sense.

Figure 4


be a substantial number of its members (E₂E₁). Nevertheless, such a
policy can make good economic sense. If the total wage “take” of the employed
worker is somehow redivided among all of the union members (either
by an unemployment assessment on the employed members or by long
"vacations" which keep all members employed part-time), the income
per worker (including those who are unemployed) will be higher than it
would be if all union members were fully employed, for if the union membership
is fixed at OE₁, then an increase in the total wage receipts, which are
divided among this fixed number of men, must clearly raise the average
wage level.

There is no need to expand this list of possible union objectives. It is
clear that the matter is more complex than the discussion has indicated and
involves considerations like the desire of the union leadership to stay in
office, and the militancy of the membership. Enough has been said to
indicate that no one a priori labor supply relationship is likely to be
universally applicable.²³

This completes our remarks on the supply of labor. Though they will
not be discussed here, it must be emphasized that institutional considerations
are also highly relevant for the analysis of the pricing of other inputs,
e.g., the rent of land and the interest on savings.


9. Inputs in Fixed Supply: Land 


In standard analysis it is customary to treat some inputs as being absolutely
fixed in supply. The economy is endowed with some set of natural
resources, and there is nothing which can be done to change the amounts 


23 For a highly suggestive theoretical analysis of alternative union policy possibilities,
see John Dunlop, Wage Determination Under Trade Unions, The Macmillan Company,
New York, 1944.


of land, mineral deposits, and other such items. The supplies of these
objects are therefore taken to be of zero elasticity—no rise in price can 
increase the available quantities. 

In a geological sense this is perfectly correct, but from the economic 
point of view it is almost certainly false. What is important for our purpose 
is not the total territory of a country, but the amount which is in use; not 
the amount of oil under the ground, but the rate at which it flows into the 
pipe lines. But a sufficient rise in price can always be counted upon to 
increase the rate of flow of these items into the economy. More will be done 
to find new oil locations, and more speculative drilling will be undertaken. 
Mines which have been abandoned as uneconomic will be reopened, and
less wasteful methods of mining will be developed and adopted. Poor land 
will be irrigated and fertilized. Only in the very short run, before there is 
time to do much about the level of production, will supplies of any inputs be 
fixed. And even then, it will be possible to do something. More of an input 
whose price has risen will be taken out of inventory and put into produc- 
tion; raw materials which become more expensive will be used more care- 
fully to reduce waste—more thought will be given to cloth-cutting patterns, 
and gold dust recovery procedures will be tightened up by the goldsmiths; 
finally, other inputs, which are more abundant, will be used in larger 
amounts to help the firm economize on the employment of these scarce 
items—if there is a shortage of equipment, it can be worked on a three-shift 
basis, thus increasing the labor/capital ratio; if there is a shortage of one
metal, another will be employed more frequently in its place. 

In sum, as Professor Viner has pointed out, input supply functions are 
virtually never zero elastic from the economic point of view except, pos- 
sibly, in the very short run. 

Land, in particular, is customarily treated as an input whose quantity 
cannot be varied by human decision. That was the notion behind its 
description by Ricardo as the original and indestructible powers of the soil. 
But we know that landfill activities have extended the land area of many 
if not most major cities. New York and San Francisco have grown in that 
way, and the map of contemporary Boston bears surprisingly little re- 
semblance to its eighteenth-century contours. Here, too, the available 
supply is affected by economic conditions—the value of real estate and the 
amount of wastes generated by the economy out of which the artificial 
land mass is often constructed. Nevertheless, in the discussion to which 
we now turn it is convenient for the exposition to treat land as an artificial 
limiting case—the input whose supply is absolutely fixed. 


10. Economic Rent as Surplus. Heterogeneous Inputs and Increasing Costs


Though in the discussion of the Ricardian model the concept of rent
referred, as it does in popular discourse, only to payments to landlords,
today the concept is applied more broadly, to any type of input which
earns a “surplus.” By a surplus in this sense we mean any payment in 
excess of the amount necessary to have the input in question supplied. On 
our premise that the supply of land is fixed, any payment for the use of 
land obviously meets that criterion. Since land is there whether it receives 
any payment or not, the entire payment is a surplus from the point of view 
of society as a whole. Of course, it is no surplus from the viewpoint of any 
one industry or any one firm bidding for some land, for it must get that 
land away from others who also want to use it. It is precisely this competi- 
tion among bidders for such an input which leads to the payment of a 
surplus or rent.²⁴

As a most obvious generalization of the Ricardian model, these payments
can occur in the bidding for inputs which differ in quality: not
just lands differing in fertility, but equipment differing in state of repair 
and workers differing in "natural" skills. Consider two workers, c and d, 
both employed at similar jobs by different firms, and suppose that c is 
more productive than d, specifically, that the return on c's output per 
month is $1,000 and that the corresponding figure for d is $950. In the 
long run, competition among the firms will tend to make the monthly 
wages of c and d differ precisely by the difference in their total value 
output, $50, for, suppose that c's wage is only $20 higher than d's. It will 
then pay d's employer to offer to hire c (instead of d) at a wage increase 
of, say, $10 per month, thereby increasing his (the employer's) receipts 
by $50 and his wage payments by only $20 + $10 = $30, which is clearly 
profitable. But when d's employer makes this bid to c, it will pay c's 
original firm to try to keep him rather than being stuck with the less 
efficient worker, d. This firm will therefore be forced to bid c's wage up even 
higher, and so on, until c is receiving exactly $50 more than d—so that it 
will be equally remunerative to a firm to hire either of these inputs; c's 
wages will not be bid any higher than this, since if his monthly wage were, 
say, $75 more than d's when his output is only $50 larger, d's labor time will 
clearly be the better buy, and either c's wage will tend to fall or that of d
will be bid up. In practice, of course, wage differences are never that closely
matched to differences in productivity.


24 The reader may well wonder how such surpluses can arise under pure competition
in light of the Euler's theorem result of Section 4 of this chapter, which tells us that the
total product is just used up by the marginal products paid to inputs. The answer is
that in competitive equilibrium even rent surpluses are equal to the marginal products
of the inputs that receive them. The landlord must, for example, be paid for each acre
of land exactly the amount that would be yielded by an additional acre. A moment's
thought assures us that this must be so because a competitive firm pays every one of
its inputs, land included, exactly the value of its marginal product. Thus it is clear
that what from one point of view is a surplus, from another is just a marginal product.
J. B. Clark may have been the first to point this out.


There are various obstacles which prevent firms from bidding against
one another for their inputs and prevent inputs from moving from firm to 
firm, to where earnings are highest. Firms do not have complete information 
on the productivity of the inputs currently hired by other companies; 
there are transfer costs involved in moving an input from one firm to 
another, including the possible loss of seniority and pension rights for 
workers who move to a new firm, as well as transportation costs and 
family dislocation costs if a move requires a worker to change his home. 
Nevertheless, despite these reservations, the prices of inputs will often 
tend to reflect the differences in their total productivity. The rent on a 
more fertile piece of land will be considerably higher than that on a barren 
area. Land desirably located in the center of a large city rents for much 
more than land in a sparsely inhabited area. Skilled labor receives higher 
wages than unskilled labor. A new, efficient factory will sell for more than 
one which is obsolete. 

However, such rent payments arise not only because of heterogeneity 
in input quality. Rent arises from any source of increasing marginal costs 
as output expands. Whether rising costs stem simply from an increasing 
disproportion of input quantities (more and more labor per acre of land),
or because one must have recourse to increasingly inferior inputs, rent 
payments will arise alike and will follow an equivalent pattern, illustrated 
in Figure 5. 

Figure 5a describes a case with discrete input units. There, if one unit 
is produced, the marginal cost is $3, but with the production of a second 
unit, marginal cost rises to $7. Since competition does not permit unequal 
returns on different units of output, the cost of the first will be bid up by 
the imposition of a $4 rent (heavily shaded area). If a third unit is added 
to output, marginal cost increases to $9, and so rent on the first unit rises 
to $6, and a $2 rent is now earned on the second unit. The rent is now 
shown by the entire shaded area.

Figure 5 (panels a and b; vertical axis: cost; MC: marginal cost curve)

But note that this rent is a transfer payment, not a real cost in the
sense of using up labor and raw materials. The production of a second and 
a third unit of output has not affected one iota the amount of labor and 
raw material needed to produce the first unit of output. The resources cost 
of the three units together is indicated by the sum of the areas of the three 
unshaded bars = 3 + 7 + 9 = 19. The shaded area (the rent), while it is a money
cost to the firm that produces the output, represents no resources cost to
society.

Figure 5b shows the same thing in the more familiar case where mar- 
ginal cost is described by a continuous curve. Here the total resources cost 
of output y* is, as usual, the area under the marginal cost curve, Oy*DC. 
It is the sum of the marginal costs—the equivalent of the sum of the bars 
in Figure 5a. And for the same reason, shaded area CDE is the producers’ 
rent or surplus which goes to the suppliers of the firm's productive resources. 
Area CDE is the same as the producers’ surplus, which together with con- 
sumers’ surplus is often taken as the appropriate maximand in the analysis 
of welfare economics (Chapter 21, Section 2). 

One conclusion that follows is that total rent payment at any given 
level of output y is determined in a way that satisfies the condition 


Average Cost of y Including Rent = Marginal Cost of y Excluding Rent.


In terms of Figure 5a, when y = 3, marginal cost excluding rent is $9.
But then the cost including rent of the first unit of output is also $9 = $3
plus $6 rent and, similarly, the cost of the second unit is also $9 = $7 + $2.
Thus, since rent brings the cost of every unit up to the cost of the marginal
unit, we have our result: average cost including rent = ($9 + $9 + $9)/3 =
$9 = marginal cost excluding rent. Obviously, the preceding argument
does not depend on the particular figures chosen. If rent is set so that the
cost of each and every unit of output (including rent) is equal to the
resources (nonrent) cost of the marginal unit, then the average of all these
(equal) figures must also equal that marginal cost.
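
The same bookkeeping can be written out as a short sketch using the marginal costs of Figure 5a. The function name is hypothetical, and the sketch assumes that marginal costs do not fall as output expands, so that the last unit produced is the most costly.

```python
def rents(marginal_costs):
    """Rent on each unit: cost of the marginal (last) unit minus its own cost."""
    marginal = marginal_costs[-1]
    return [marginal - mc for mc in marginal_costs]


marginal_costs = [3, 7, 9]                # dollars per unit, from Figure 5a
unit_rents = rents(marginal_costs)        # rent earned on each unit
resources_cost = sum(marginal_costs)      # cost in labor and materials
total_rent = sum(unit_rents)              # transfer payment, no resources used
average_cost_incl_rent = (resources_cost + total_rent) / len(marginal_costs)

print("rent per unit:", unit_rents)
print("resources cost:", resources_cost, "total rent:", total_rent)
print("average cost including rent:", average_cost_incl_rent,
      "marginal cost excluding rent:", marginal_costs[-1])
```

Changing the list of marginal costs to any other nondecreasing sequence leaves the final equality intact, which is the point of the general argument.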

This result has some significance for welfare theory. We saw in Chapter 
16, Section 4, that the long-run supply curve of a competitive industry is 
the same as its curve of long-run average costs including rent. We see now 
that that curve is also the curve of long-run marginal cost excluding rent 
and this helps to explain the optimality of resource allocation under pure 
competition in the absence of externalities. For it shows, roughly, that 
where the demand and supply curves intersect, marginal social utility 
measured in money terms, as indicated by the height of the demand curve, 
will be equal to marginal resources cost, as given by the height of the 
supply curve. Thus, in competitive equilibrium marginal social utility of 
output will equal marginal resources cost, as optimality requires.


25

Capital and Investment Decisions


Capital budgeting refers to the investment decision-making proce- 
dures of business firms and other enterprises. The subject encompasses such 
topics as the selection of projects (which new factories, if any, should the 
company build), the timing of the investment, the determination of the 
amount to be invested within any given time period, and the arrangement
of the financial means necessary for the completion of the projects. The 
calculations which are appropriate for these decisions for the most part 
derive directly from the theory of capital. 

In keeping with the theoretical nature of the book, the materials of this 
chapter will not necessarily approximate capital budgeting procedures as 
they are currently encountered in practice. Rather, the chapter will deal,
as far as possible, with methods which approximate optimality. However, 
there is an extremely important limitation which must be emphasized from 
the very beginning. Imperfect foresight into the future, risk, and uncer- 
tainty will for the most part be ignored, because economists have not 
devised really effective methods for taking them into account in the analy- 
sis. Later in the chapter something will be said about the matter, but the 
reader will readily recognize the limitations of that discussion. Unfortu- 
nately, capital budgeting is the one subject where we can least afford to 
abstract from limitations in our knowledge of the future, because, by its 
very nature, the investment decision can only be justified in terms of its 
prospective effects. Nevertheless, we must proceed, as does most of the 
literature, without the presence of this leading character in the drama.

The principles which will be described in this chapter are equally
applicable to a wide variety of investment decisions, to net investment as
well as to replacement, to scrapping and retirement. From the point of
view of the analysis there is no structural difference in the decision which 
pertains to new investment and that which deals with replacement. When a 
company considers the purchase of a new machine, the fact that another 
one like it has served as a predecessor is, in a formal sense, a bit of irrelevant 
history.¹ The decisive question is whether the marginal profit contribution
of the item justifies its acquisition, and if it does not, then the machine 
should go unpurchased whether or not it is a replacement item. Similarly, 
since scrapping or any other disinvestment decision (nonreplacement) can 
be treated as an act of negative investment, it is at least plausible that it 
should be based on the same principles and procedures as a decision to 
invest. 

It is important to emphasize that, in an optimal investment decision, 
any historical sunk costs, such as the machine’s employment of floor space, 
which would otherwise go unused, are totally irrelevant. For no current 
decision can change the past. Other examples of such irrelevant historical 
costs are the initial costs of provision of a railroad road bed, which should 
not affect the decision to purchase and run additional cars over the line, 
or the costs of a dam which have made a waterway navigable and which 
should not determine whether to operate another barge. In each case the 
added investment should be undertaken if and only if it more than pays 
for itself, whether or not it appears to bear its share of the outlays of the 
past. 


To economists, the terms “capital” and “investment” do not refer to 
quantities of money or their use in purchasing stocks and bonds. Rather, 
we take them to denote “real” assets—factories, raw materials, machinery, 
inventories of finished and half-finished goods (goods in process), etc. 
Capital, in sum, is any previously produced input or asset of a business 
firm or any other producer.²


1 Of course, if the firm previously possessed a similar machine, the company's labor
force is more likely to be experienced in its operation, and this fact may contribute to 
the prospective profitability of a replacement. But this and other similar considerations 
must still enter an optimal investment calculation through their effects on anticipated 
profits. 

2 Tt may be noted that this is not how the term “capital” was used by Marx, who 
employed the term to represent both the physical assets without which the worker 
cannot put his labor power to use and the financial flows from these physical assets, 
both of which are the property of the capitalists. Physical capital, in whose construction 
the worker played a critical role, later confronts that same worker as “an alien object” 
without which he cannot function. Thus, capital to Marx is a social or institutional 
relationship, not a mere set of machines. It is only in the historical stage of capitalism 
that machines are transformed into capital by becoming simultaneously the instrv ments 
of control of the economic system and the private property of the capitalists. 
Investment refers to the production or acquisition of any such real
capital asset. Specifically, it is the time rate of increase of capital assets.
If a firm has a capital of $17 million on January 1 and invests at the rate
of $2 million per year, then its capital at the end of the year must be $19
million. Symbolically, if we designate investment and capital in year t,
respectively, by \(I_t\) and \(C_t\), we must have³

\[ I_t = C_{t+1} - C_t. \]

1. Discounting and Present Value: Valuation of Capital Assets 


Before we can go on to the substance of capital theory we must examine 
the fundamental principles of its arithmetic, describing the procedure which 
must be employed to compare present and future receipts and outlays. It 
is characteristic of capital, as we have seen, that its construction and 
maintenance call for expenditures at different dates and that its yields are 
obtained at still other times. 

Suppose, for example, that a $100 investment yields $20 at the end of 
the year and $25 at the end of two years, when the $100 is also returned to 
the investor. Would he have been better or worse off if he had, instead, 
received $22 each year? To answer such a question we must be able to 
compare the value of money (or other resources) at different dates. A 
dollar today, a dollar at the end of the year, and a dollar two years from 
now are all essentially different beasts. 

The sooner we receive our money, the better off we are, for the sooner 
it can then be put to work earning more money for us. Let us see just 
how much more a quantity of money is worth if it is received sooner. 
Suppose that P dollars were invested for one year at a rate of interest i,
compounded annually. Then at the end of the year it would yield iP
dollars which, with the return of the principal, P, would give us P + iP =
P(1 + i) dollars. Let us call the initial sum \(P_0\) (P dollars at our initial
date, year 0) and write the equivalent sum at the end of one year as \(P_1\)
(meaning \(P_1\) dollars receivable at the end of year 1). Then we must have
the expression

\[ P_1 = P_0(1 + i). \]

That is, \(P_0\) dollars now must be worth the same as \(P_1 = P_0(1 + i)\)
dollars received at the end of the year. In other words,

\[ P_0 = \frac{1}{1 + i} P_1 = D P_1, \]


3 Alternatively, we may use calculus notation to write I = dC/dt; investment is the 
rate of increase of capital over time. 
where the symbol D, called the discount factor, is used to designate the
fraction 1/(1 + i). \(DP_1\) is called the discounted present value of \(P_1\) dollars
receivable one year from today. For example, if the rate of interest were
5 per cent so that i = 0.05, then we would have D = 1/1.05, and we would
conclude that $100 receivable at the end of the year is today worth only
(1/1.05)$100 = $95.24 (approximately), because at a 5 per cent interest
rate $95.24 will grow to $100 in one year.

What is the present value, \(P_0(2)\), of some amount, \(P_2\) dollars, to be
received two years in the future? If invested for one year, \(P_0(2)\) will grow
to \(P_1 = (1 + i)P_0(2)\). In a second year this amount will increase again
to \((1 + i)P_1 = (1 + i)(1 + i)P_0(2) = (1 + i)^2 P_0(2)\). In other words, if
\(P_0(2)\) now is to be equal in value to \(P_2\) receivable in two years, we must
have \(P_2 = (1 + i)^2 P_0(2)\) or

\[ P_0(2) = \left(\frac{1}{1 + i}\right)^2 P_2 = D^2 P_2. \]

Similarly, at compound interest, \(P_0(n)\) dollars invested for n years will
grow into \(P_n = (1 + i)^n P_0(n)\). So the discounted present value of \(P_n\)
dollars receivable in n years is readily seen [by division by \((1 + i)^n\)] to be

\[ P_0(n) = \left(\frac{1}{1 + i}\right)^n P_n = D^n P_n. \]


This is the generalized formula of discounted present value which permits 
us to convert amounts payable or receivable at different dates into similar 
terms—they are all made comparable by being translated into their 
equivalent current value. 
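
The discounting rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `discount_factor` and `present_value` are assumptions introduced here, not notation from the text.

```python
def discount_factor(i: float) -> float:
    """Return D = 1 / (1 + i) for an annual interest rate i."""
    return 1.0 / (1.0 + i)


def present_value(amount: float, i: float, n: int) -> float:
    """Present value of `amount` receivable n years from today: D**n * amount."""
    return discount_factor(i) ** n * amount


if __name__ == "__main__":
    # $100 receivable in one year at 5 per cent (the text's $95.24 example).
    print(round(present_value(100, 0.05, 1), 2))
    # The same $100 receivable in two years is worth less today.
    print(present_value(100, 0.05, 2))
```

Compounding and discounting are inverse operations here: a sum grown at rate i for n years and then discounted for n years returns to its starting value.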

After this translation, amounts of money pertaining to different dates can 
be added or subtracted directly. If a firm spends $90 today and receives $105 
a year from today, the net present value of the operation will be 105D −
90 = [1/(1 + i)]105 − 90. At a 5 per cent rate of interest (i = 0.05) this
gives (1/1.05)105 − 90 = 10, that is, the net yield of the operation is $10
in present value. 

More generally, suppose that a firm expects to receive \(R_0\) dollars
currently, \(R_1\) dollars in one year, \(R_2\) dollars in two years, etc. The total
capitalized present value, C, of this stream of expected receipts is given by

\[ C = R_0 + DR_1 + D^2 R_2 + \cdots + D^n R_n, \]

where, as before, D represents the discount factor 1/(1 + i).
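
As a rough sketch of the capitalized-value formula, the following Python function sums a dated stream of receipts and outlays. The name `capitalized_value` and the convention that outlays are entered as negative receipts are assumptions made for this illustration.

```python
from typing import Sequence


def capitalized_value(receipts: Sequence[float], i: float) -> float:
    """C = R_0 + D*R_1 + D**2*R_2 + ... + D**n*R_n, with D = 1 / (1 + i).

    receipts[t] is the amount received t years from today; an outlay is
    entered as a negative receipt.
    """
    d = 1.0 / (1.0 + i)
    return sum(r * d ** t for t, r in enumerate(receipts))


if __name__ == "__main__":
    # Spend $90 today and receive $105 a year from today, at 5 per cent.
    print(round(capitalized_value([-90, 105], 0.05), 2))
```

Entering the $90 outlay as a negative receipt at date 0 reproduces the $10 net present value computed in the text.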
An important special case arises if all these expected receipts (or pay-
ments) are equal, i.e., if we have \(R_0 = R_1 = \cdots = R_n = R\). Then we
have

\[ C = R + DR + D^2 R + \cdots + D^n R = R(1 + D + D^2 + \cdots + D^n). \]

Now the terms of any geometric series such as \(1 + D + D^2 + \cdots + D^n\)
(call it \(S_n\)) can be totaled with the aid of the expression⁴

\[ (1) \qquad S_n = (1 + D + D^2 + D^3 + \cdots + D^n) = \frac{1 - D^{n+1}}{1 - D}. \]

In the special situation where the stream of payments is expected to
continue into the indefinite future, then, provided, as is normally the case,
that D < 1, we obtain from (1)

\[ S_\infty = (1 + D + D^2 + \cdots) = \frac{1}{1 - D}. \]

For since D is less than unity, \(D^2\) is less than D, \(D^3\) is smaller still, and,
generally, \(D^n\) grows smaller and smaller as n grows larger and larger, and
the term \(D^{n+1}\) in (1) tends to disappear; i.e., it approaches zero as n
approaches infinity.⁵
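
A quick numerical check of Equation (1) and of its limiting form may be helpful. The sketch below is illustrative only; the function names are assumptions, and the 5 per cent rate is chosen simply to match the earlier examples.

```python
def geometric_sum_direct(d: float, n: int) -> float:
    """Add up 1 + D + D**2 + ... + D**n term by term."""
    return sum(d ** k for k in range(n + 1))


def geometric_sum_closed(d: float, n: int) -> float:
    """Closed form of Equation (1): (1 - D**(n + 1)) / (1 - D)."""
    return (1.0 - d ** (n + 1)) / (1.0 - d)


def geometric_sum_limit(d: float) -> float:
    """Limit of the sum as n grows without bound; requires 0 <= D < 1."""
    if not 0.0 <= d < 1.0:
        raise ValueError("the infinite sum converges only for 0 <= D < 1")
    return 1.0 / (1.0 - d)


if __name__ == "__main__":
    d = 1.0 / 1.05  # discount factor at 5 per cent
    print(geometric_sum_direct(d, 10), geometric_sum_closed(d, 10))
    # For large n the finite sum approaches 1 / (1 - D).
    print(geometric_sum_closed(d, 2000), geometric_sum_limit(d))
```

At 5 per cent the limit is 1.05/0.05 = 21, so a perpetual stream of R dollars beginning today is worth 21R.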


2. Discount Rate and Opportunity Cost; Real vs. Nominal Rates 


Before getting down to substantive conclusions it is appropriate to 
pause briefly to discuss the logic of the discounting process which has just 
been described. The discount factor 1/(1 + i) has been tied directly to the
rate of return on investment. The discount rate is just a measure of what 
we lose by receiving our money later rather than now. It is the opportunity 
cost of not having those resources sooner. So long as there exists a perfect 
capital market, i.e., so long as a reputable businessman can borrow or lend 
as much money as he needs at the going rate of interest, the rate of interest 
is the required measure of the cost of postponing the receipt of money. For 
if he needs any money now, he can obtain it at once by borrowing and 
paying iP dollars per year for it, until the date when his postponed receipts 
finally arrive. Conversely, if he receives money now rather than later, he 


4 Proof: \(S_n = 1 + D + D^2 + \cdots + D^n\), so that, multiplying and dividing by 1 − D,

\[ S_n = \frac{(1 + D + D^2 + \cdots + D^n)(1 - D)}{1 - D} = \frac{(1 + D + D^2 + \cdots + D^n) - (D + D^2 + \cdots + D^{n+1})}{1 - D} = \frac{1 - D^{n+1}}{1 - D}. \]

5 For a discussion of the formula for continuous discounting, \(P_0 = e^{-it}P_t\), its
rationale, and its advantages in differentiation, see the appendix to this chapter.
can always employ it to earn iP dollars per year by using it to pay off a
portion of his debts or by lending it out at the current interest rate. 

If the capital market is imperfect—if the businessman can only borrow 
limited amounts, or if the rate of interest on his loans rises as he borrows 
more, or if the rate of interest he can earn by lending his money is less 
than the amount it costs him to borrow—the connection between the in- 
terest and discount rates is not so simple. But, in any event, the discount 
rate remains the opportunity cost of postponed receipts of money. For example, 
if the businessman is limited in the funds he can borrow, he may be 
unable to build a new plant which is capable of returning say 9 per cent on 
investment. In that case, the relevant loss from postponement of receipts 
is not any of the market rates of interest but the 9 per cent profit foregone 
on the most lucrative of the investment opportunities from which he is 
precluded by not having his money now.⁶


In a period of changing price levels it is important to distinguish 
between real and nominal rates of interest. Suppose an investment of 
$1,000 brings in $100 but the price level is rising at an annual rate of six 
per cent. The nominal rate of interest is then ten per cent but in fact the 
investor is earning far less than that. For each year his $1,000 investment 
loses approximately 6 per cent or $60 in purchasing power. Thus, of his 
$100 nominal earnings, only $40 are a net return, with the remaining $60 
just serving to make up for the loss in purchasing power of his capital as it 
is eaten up by inflation. In sum, we have the basic relationship 


real rate of interest = nominal rate of interest − rate of inflation.


Failure to recognize this relationship can lead to serious misunderstandings. 
For example, a public used to 6 per cent rates of return on bonds at an 
earlier date is likely to consider a twelve per cent return to be exorbitant. 


6 For many purposes the interest concept itself must be extended to cover more
than the direct pecuniary payment to lenders. Suppose a businessman has money tied 
up in inventory which he will sell in two years. Its interest cost, for our purposes, may 
be considered to consist of all the expenses which are incurred simply as a result of the 
passage of time during the period when the funds are kept in illiquid form in inventory 
holdings. In addition to payments to the bank on the money tied up in the inventory, 
there are several other types of cost of the relevant variety. Among these are taxes on 
inventory holdings; insurance costs; costs of warehousing such as inspection, rental, 
etc.; costs of spoilage and pilferage; obsolescence; and so on. The longer the funds are 
kept tied up in inventory, the longer these costs accumulate. In some timing decisions 
in practice—for example, in the decision as to how long to age brandy—tax rates on 
inventory holdings may play a more important role than do interest rates proper, 
particularly in the minds of the businessmen involved. Similarly, to the holder of stocks 
of military equipment or style goods such as women's clothing, the expected rate of
obsolescence may be far more important than the interest rate proper. 
However, if the rate of inflation has simultaneously risen from 1 per cent
to 10 per cent per annum, we see that the real rate of interest, far from 
rising, has in fact fallen drastically from 5 to 2 per cent. The change is 
surely no bonanza for the investor! 
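
The arithmetic of real and nominal rates can be sketched as follows. The first function is the text's relationship. The second, `exact_real_rate`, is not in the text: it is the standard compounding refinement, included here only as an assumption to show why the text speaks of an "approximate" loss of purchasing power.

```python
def real_rate(nominal: float, inflation: float) -> float:
    """The text's basic relationship: real = nominal - inflation."""
    return nominal - inflation


def exact_real_rate(nominal: float, inflation: float) -> float:
    """Compounded version (an assumption, not from the text):
    (1 + nominal) / (1 + inflation) - 1."""
    return (1.0 + nominal) / (1.0 + inflation) - 1.0


if __name__ == "__main__":
    # $100 earned on $1,000 while prices rise 6 per cent a year.
    print(real_rate(0.10, 0.06), exact_real_rate(0.10, 0.06))
    # Bond example: 6 per cent with 1 per cent inflation, then 12 with 10.
    print(real_rate(0.06, 0.01), real_rate(0.12, 0.10))
```

For moderate rates the two versions differ only slightly, which is why the simple subtraction is adequate for the comparisons made in this section.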

We turn next to a fundamental issue: the criterion to be used in
determining the profitability of a proposed investment project. Three of 
the criteria that are most frequently considered are the payout period, the 
internal rate of return (marginal efficiency of investment) and the dis- 
counted present value. These will be discussed and evaluated in the next 
three sections. 


3. Payout Period 


A criterion which is frequently employed to judge the profitability of an 
investment in practice is its payout (or payback) period. For example, if a 
factory costs $7 million to construct and is expected to yield $2 million per 
year, its payout period is taken to be three and one-half years. The payout 
period of a project, then, is defined as the number of years which is required 
to accumulate earnings sufficient to cover its costs. The payout period 
criterion ranks projects in terms of this figure, asserting that a project with 
a four-year payout period is generally to be preferred to a six-year payout 
investment and that two projects, each of which is expected to have a 
three-year payout period, are, without further information on anything but 
their prospective profit yields, to be viewed with indifference. 

The payout-period criterion is a crude rule of thumb and it is rarely 
defended in the literature except as an easy and inexpensive device for 
dealing with risk—a role for the criterion which will be considered later in 
this chapter. It is described here because an explanation of its shortcomings
is illuminating. 

The basic weakness of the payout criterion lies in the limited period of 
time which it takes into account. A piece of equipment may well continue 
to operate for many years after it has covered its initial costs. Whether or 
not it will do so is highly relevant for determining the profitability of its 
purchase, but this element is nowhere taken into account by the payout 
calculation. To take an extreme example, consider two items with an equal 
cost and an equal payout period of seven years. Suppose the first of the 
candidate items is not likely to last much beyond these seven years, while 
the other may be expected to remain in use for a considerable amount of 
time thereafter. The latter is then clearly the better investment, but this 
is totally ignored by the payout criterion, which rates the two items equally.

Moreover, the payout criterion completely ignores much of the time
pattern of receipts. For example, an investment whose cost is 7 (hundred 
thousand dollars) will be paid off in three years by either of the following 
earnings streams: stream A which yields $1 in the first year, $1 in the
second, and $5 in the third, or stream B which offers $5 in the first year and 
$1 in each of the following two years. Yet it is clear that most firms will not 
be indifferent between the two propositions and, as a matter of fact, they 
will usually prefer stream B, which returns the investors’ money much more 
promptly. 
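
A minimal sketch of the payout-period calculation follows. It assumes, as the factory example implies, that each year's earnings accrue evenly through the year; the function name `payout_period` is hypothetical.

```python
from typing import Optional, Sequence


def payout_period(cost: float, earnings: Sequence[float]) -> Optional[float]:
    """Years needed for cumulative earnings to cover `cost`.

    earnings[k] is the amount earned during year k + 1 and is assumed to
    accrue evenly within that year. Returns None if the cost is never
    recovered within the stream supplied.
    """
    remaining = cost
    for year, earned in enumerate(earnings):
        if earned > 0 and earned >= remaining:
            return year + remaining / earned
        remaining -= earned
    return None


if __name__ == "__main__":
    # Factory costing $7 million that yields $2 million per year.
    print(payout_period(7, [2, 2, 2, 2]))
    # Streams A and B from the text receive the same three-year rating.
    print(payout_period(7, [1, 1, 5]), payout_period(7, [5, 1, 1]))
```

The calculation stops as soon as the cost is covered, which is exactly why it cannot distinguish stream A from stream B or take account of any earnings received afterward.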


4. Marginal Efficiency of Investment 


A second criterion, the internal rate of return or marginal efficiency of 
investment, carries behind it the prestige of some of the great names of 
economics. Keynes was among those who employed this measure of the 
profitability of an investment. Actually, this criterion does not work out 
too badly in most cases, but we shall see that, in principle, and sometimes 
in practice, it is subject to serious shortcomings. 

The marginal efficiency of an investment project is defined as that rate
of interest or return which would render the discounted present value of its
expected future marginal yields exactly equal to the investment cost of the
project. For example, consider a project whose anticipated yield is $10 per
year in perpetuity (beginning at the end of the year) and whose initial cost
is $100. If the rate of interest were 9 per cent [so that i = 0.09, and the rate
of discount would be 1/(1 + i) = 1/1.09], then the present value of this
income stream would be 10(1/1.09) + 10/(1.09)^2 + ... = 10/i = $111
(approximately).⁷ Clearly then, since 111 is greater than the $100 value of
the original investment, our trial interest rate, i = 0.09, is not equal to the
marginal efficiency of investment. We need a higher discount rate, and
hence a higher value of i, to reduce the value of the income stream down to
the $100 investment cost. If we try i = 0.11, we will find we have gone too
far in the other direction, because the capitalized present value of the $10
stream will now be equal to 10/1.11 + 10/(1.11)^2 + ... = 10/i = $91
(approximately), which is less than the $100 investment cost. If, finally,
we try a 10 per cent interest rate (i = 0.1), we find that the present value
of the income stream is exactly $100, so that 10 per cent is the marginal
efficiency of our investment. This measure is sometimes also called the
project’s internal rate of return, because it evaluates the internal profit-
ability of a specific investment project, and it may have no relation to the
company’s cost of capital.

7 The general formula for the present value of K dollars in perpetuity, beginning at
the end of the current period, is K/i. This is so because by footnote 4 of this chapter
[and writing D = 1/(1 + i)], it is equal to

\[ K(D + D^2 + \cdots) = KD(1 + D + D^2 + \cdots) = KD\left(\frac{1}{1 - D}\right) = \frac{KD}{1 - D} = \frac{K/(1 + i)}{1 - 1/(1 + i)} = \frac{K}{(1 + i) - 1} = \frac{K}{i}. \]
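
The trial-and-error search described in this section can be mechanized. The sketch below uses simple bisection, which is one possible method and not one prescribed by the text; the names `marginal_efficiency` and `perpetuity_net_value` are assumptions, and the routine presumes that the net value changes sign exactly once between the two starting rates.

```python
from typing import Callable


def perpetuity_net_value(i: float, k: float = 10.0, cost: float = 100.0) -> float:
    """Net present value of k dollars a year in perpetuity: k / i - cost."""
    return k / i - cost


def marginal_efficiency(net_value: Callable[[float], float],
                        lo: float, hi: float, tol: float = 1e-10) -> float:
    """Find the rate at which net_value is zero by bisection.

    Assumes net_value(lo) and net_value(hi) have opposite signs and that
    there is a single crossing between them.
    """
    f_lo, f_hi = net_value(lo), net_value(hi)
    if f_lo * f_hi > 0:
        raise ValueError("net value does not change sign between lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        f_mid = net_value(mid)
        if f_lo * f_mid <= 0:
            hi = mid               # crossing lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # crossing lies in [mid, hi]
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Start from the two trial rates used in the text, 9 and 11 per cent.
    print(round(marginal_efficiency(perpetuity_net_value, 0.09, 0.11), 6))
```

Starting from the text's two trial rates, the search narrows to the 10 per cent figure found above by inspection.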

The marginal efficiency criterion tells management to undertake an 
investment so long as its marginal efficiency exceeds the rate of interest (or 
other costs) which the company incurs when it obtains more money. The 
argument is that if the company pays a 6 per cent rate of interest and can 
use these funds to purchase an item which will yield a stream of returns 
which can be evaluated at 10 per cent, then it will always pay the company 
to do so. And the argument is correct so far as it goes. The most important
difficulty of the concept arises in circumstances not considered in the pre- 
ceding discussion. Suppose that for some reason company management is 
limited in the number of investment projects which it can undertake. It 
must then assign priorities and settle upon the combination of projects 
which promises to be most profitable. Here marginal efficiency of invest-
ment runs into difficulties as a guide. To see how its problems arise we must
compare it with the discounted present-value criterion. 


5. Discounted Present Value vs. Marginal Efficiency 


We have already seen earlier in this chapter how one calculates the 
discounted present value of the stream of returns expected to result from 
an investment project. The discounted present value criterion tells us 
simply that a project will be profitable if the discounted present value of 
its expected earnings is greater than its cost (including discounted future 
maintenance and operating costs). That is, it tells us to invest in a project 
if the discounted value of its revenues minus its costs is greater than zero. 

To compare the behavior of the marginal efficiency concept with that
of the discounted present value let us tabulate the results of the com- 
putation which we have just completed. This is done in Table 1. We see 
that since at a 9 per cent interest rate the present value of the illustrative 
investment is 111 (as we had already calculated), the net gain from our 
$100 investment is $111 − $100 = $11. A similar figure is shown for
interest rates of 1 per cent, 2 per cent, etc. 

We can now depict these results graphically in Figure 1, where we 
plot the net investment yields (their net discounted present value) against 
the various alternative interest rates. We see that the marginal efficiency of 
investment is the interest rate, E, at which our curve, VV’, cuts the hori-
zontal axis. For, by definition, the marginal efficiency of investment reduces
the present value of the investment to the present value of its cost, so that 
its net yield must then be zero. 

On the other hand, to determine the discounted present value we must
know the appropriate rate of interest. Suppose the market interest rate, C,
is 5 per cent (i = 0.05). Then, as point A indicates, the net discounted
present value of the investment is $100.

TABLE 1. PRESENT VALUE OF $10 PER YEAR IN PERPETUITY
AT VARIOUS DISCOUNTING INTEREST RATES

Interest rate    Discounted present value    Net discounted present value
i                V = 10/i                    N = V − $100
0.01             1000                        900
0.02              500                        400
0.03              333                        233
0.04              250                        150
0.05              200                        100
0.06              167                         67
0.07              143                         43
0.08              125                         25
0.09              111                         11
0.1               100                          0
0.11               91                         −9
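
The entries of Table 1 follow directly from the perpetuity formula V = 10/i. As a small sketch (the function name `table_row` is an assumption, and figures are rounded to whole dollars as in the table):

```python
from typing import List, Tuple


def table_row(i: float, k: float = 10.0, cost: float = 100.0) -> Tuple[int, int]:
    """Return (V, N) rounded to whole dollars, where V = k / i and N = V - cost."""
    v = k / i
    return round(v), round(v - cost)


def build_table(rates: List[float]) -> List[Tuple[float, int, int]]:
    """One (rate, V, N) row per trial interest rate."""
    return [(i, *table_row(i)) for i in rates]


if __name__ == "__main__":
    rates = [k / 100 for k in range(1, 12)]  # 0.01, 0.02, ..., 0.11
    for i, v, n in build_table(rates):
        print(f"{i:5.2f} {v:6d} {n:6d}")
```

The net value changes sign between 9 and 11 per cent, which is where the curve VV’ crosses the horizontal axis.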

The marginal efficiency of investment criterion tells the businessman to 
invest if the marginal efficiency of investment is greater than the interest 
rate (the marginal cost of capital), i.e., in the diagram if the intersection 
point E lies to the right of point C, the market interest rate. On the other 
hand, the discounted present-value criterion approves any investment 
whose net discounted present value, CA, is positive. We see, then, that in 
our illustrative case the two criteria are in agreement, and, in fact, we can 
normally expect them to be so. For, ordinarily, a rise in the discounting 
interest rate will reduce the present value of an investment, so that VV’ 
will have a negative slope. Suppose in such a case that the net present 

value of the investment is positive (point A is above the horizontal axis at 
the current interest rate C). Then E, the point at which VV’ crosses the 
horizontal axis, must clearly lie to the right of C, i.e., the marginal efficiency 
must also exceed the current interest rate. 

But our net discounted present-value curve need not always have a 
negative slope, and that leads to the first (though, practically, perhaps not 
very important) shortcoming of the internal rate of return. Suppose, for 
example, that a proposed investment project is expected to generate the 
following income stream in the five years of its anticipated life: first year 
—$100, second year $90, third year $110, fourth year —$60, fifth year 
—$60. We see that the project generates losses both at the beginning and 


8 This point and the one which follows were first called to the attention of economists
by Lorie and Savage. See J. H. Lorie and L. J. Savage, “Three Problems in Capital 
Rationing,” Journal of Business, Vol. XXVIII, October 1955, reproduced in Ezra 
Solomon (ed.), The Management of Corporate Capital, The Free Press of Glencoe, New 
York, 1959. In the following chapter we will see how this issue has recurred in pure 
capital theory in the debate over “reswitching of techniques.” 
end of its life (the latter frequently occur in extractive industries, where
the cost of closing a mine is high, or they may arise out of financial obliga- 
tions incurred by the project over the span of its existence, obligations 
which are not covered by the investment’s earnings in the fourth and fifth 
year because of the declining productivity of the item). Then VV’ will 
take the form depicted by the solid portion of the curve in Figure 2 and will 
intersect the horizontal axis twice, once at R and once at S. The reason this 
will occur is that with a very low rate of interest, the negative returns in 
the fourth and fifth year will hardly be discounted at all, so that, together 
with the initial outlay, they can swamp the middle-period profits. For 
example, if the rate of interest were zero so that future returns were given 
exactly equal weight with current returns, our project would be evaluated 
at —$100 + 90 + 110 — 60 — 60 = —20. However, with an intermediate 
rate of interest, the fourth- and fifth-year losses will fall in relative im- 
portance, and so the stream will assume a positive present value. Finally, 
if the interest rate is extremely high, nothing but the current year figure 
(the $100 initial loss) will retain any appreciable value after discounting; 
and so the stream will once again be ascribed a negative present value— 
hence the humped shape of the VV’ curve in Figure 2. 
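
The humped curve can be explored numerically. The sketch below evaluates the net present value of a stream at a list of trial rates and reports where it changes sign. With the five-year figures exactly as printed, and the first figure dated at year 0, this calculation gives the stated −20 at a zero rate but appears to leave the net value negative at every trial rate (the hump peaks at roughly −12 near 18 per cent). The two crossings are therefore illustrated with a smaller hypothetical stream of −100, 230, −132, which is an assumption introduced here and not taken from the text.

```python
from typing import List, Sequence, Tuple


def net_present_value(stream: Sequence[float], i: float) -> float:
    """Sum of stream[t] * D**t with D = 1 / (1 + i); stream[0] is undiscounted."""
    d = 1.0 / (1.0 + i)
    return sum(r * d ** t for t, r in enumerate(stream))


def sign_changes(stream: Sequence[float],
                 rates: Sequence[float]) -> List[Tuple[float, float]]:
    """Adjacent pairs of trial rates between which the net value changes sign."""
    brackets = []
    for lo, hi in zip(rates, rates[1:]):
        if net_present_value(stream, lo) * net_present_value(stream, hi) < 0:
            brackets.append((lo, hi))
    return brackets


TEXT_STREAM = [-100, 90, 110, -60, -60]   # figures as printed in the text
HYPOTHETICAL_STREAM = [-100, 230, -132]   # illustrative assumption only

if __name__ == "__main__":
    print(net_present_value(TEXT_STREAM, 0.0))  # -20.0, as in the text
    # Low, intermediate, and high trial rates for the hypothetical stream.
    print(sign_changes(HYPOTHETICAL_STREAM, [0.0, 0.07, 0.15, 0.25]))
```

For the hypothetical stream the net value is negative at low rates, positive at intermediate rates, and negative again at high rates, so there are two candidate "marginal efficiencies" (10 and 20 per cent), which is the ambiguity discussed in this section.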

Now in this diagram it is not clear whether point R or S is to be called
the marginal efficiency of investment.⁹ Moreover, the marginal-efficiency and
discounted-present-value criteria need no longer supply the same answer
on the acceptability of a given investment project. If C is the current in-
terest rate and R is taken to be the marginal efficiency of our investment,
then the project should be rejected on the marginal efficiency criterion,
because R < C. But at interest rate, C, the net discounted present value
of the investment, CA, is positive, and so, on the present-value criterion,
the project is profitable.

9 It is also perfectly possible to encounter VV’ curves which contain several more
intersections with the horizontal axis, such as T and U, in the broken extension of our
original VV’ curve. The entire problem arises because to find the marginal efficiency of
investment we seek to determine the discount rate, D = 1/(1 + i), which satisfies
\(R_0 + DR_1 + D^2 R_2 + \cdots + D^n R_n = 0\), where \(R_0, R_1, \ldots, R_n\) is the stream of expected
returns from the project. Since this is an nth-degree equation, it may have as many
as n distinct real roots, i.e., n different marginal efficiency figures.

More serious, however, is the shortcoming of the marginal efficiency 
criterion which shows up when, for some reason, we are forced to choose
between two mutually exclusive projects. Let their respective net dis- 
counted present-value curves be VV’ and WW’ in Figure 3, and let us refer 
to the corresponding projects as v and w. Then we can see at once that with 
the current interest rate at C, project w has the larger discounted present 
value (CB > CA) but project v is characterized by the larger marginal 
efficiency of investment (OS > OR)! 

Which, then, are we to choose? The answer is straightforward if the firm 
can borrow all the money it desires at C per cent so that it is some other 
consideration which has forced it to choose between v and w. For example, 
v and w may be alternative warehouse facilities and the company may need 
no more than one additional warehouse. In that case the present value is 
just what the title implies. It is the value which the rational firm must 
necessarily assign to a project, as was shown in Section 2. The project 
which has the largest discounted present value, by definition, makes the
largest net contribution to the wealth of the firm. If some other criterion
appears to tell us otherwise, that other criterion must somehow miss the
point and we must ignore it.

The reason for the disparity of the two calculations is to be ascribed
to their implicit treatment of two investment projects of different duration.
Suppose one project brings in $100 and another brings in $105 one year 
later. How is the advantage of the earlier payment to be evaluated—at the 
rate of interest actually available on the market, as assumed by the dis-
counted present value method, or is the hundred dollars to be assumed to
earn the project's internal rate of return? That is, which of these two rates
should be used in discounting the $105 to make it comparable with the
$100? If the relevant project is in fact terminated when the $100 is received, 
that investment opportunity is now foreclosed and so only the market rate 
of interest is the relevant investment advantage offered by the earlier 
receipt of the $100. The answer given by the method of discounted present 
value is then the correct one. 
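
A small numerical case may make the conflict between the two rankings concrete. The two projects below are hypothetical (they are not the v and w of Figure 3): each costs $100 today, one returns $120 after one year and the other $150 after three years, and the market rate is taken to be 5 per cent. The function names are likewise assumptions.

```python
def net_value(cost: float, payoff: float, years: int, i: float) -> float:
    """Net present value of a single payoff received `years` from now."""
    return payoff / (1.0 + i) ** years - cost


def internal_rate(cost: float, payoff: float, years: int) -> float:
    """Rate at which a single payoff just repays the cost:
    (payoff / cost) ** (1 / years) - 1."""
    return (payoff / cost) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    market_rate = 0.05
    short = (100.0, 120.0, 1)  # hypothetical short-lived project
    long_ = (100.0, 150.0, 3)  # hypothetical longer-lived project
    for name, (cost, payoff, years) in (("short", short), ("long", long_)):
        print(name,
              round(net_value(cost, payoff, years, market_rate), 2),
              round(internal_rate(cost, payoff, years), 4))
```

The short project has the higher internal rate of return, but the long project has the larger net present value at the market rate. The internal-rate ranking is right only if the $120 could be reinvested at 20 per cent for the remaining two years, whereas the present-value ranking assumes reinvestment at the 5 per cent market rate.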

Where there are limitations to the firm's borrowing ability, i.e., where 
its capital is rationed, and where that is the reason it is forced to choose 
between v and w, no such simple and categorical judgment can be offered, 
for in that case there is no well-defined current interest rate. The interest 
rate which the firm happens to pay on the money it borrows does not 
measure the true opportunity value of cash to the firm. If capital is really 
rationed, it means that management would like to borrow more at the 
current interest rate. Therefore the marginal value of further capital to 
the firm must exceed that interest rate. It is therefore no longer possible to 
make a direct identification of i, the rate of interest appropriate for the 
determination of discounted present value. We shall see in a later section 
the method of calculation which is appropriate for project selection where
capital is rationed. It may only be indicated at this point that the problem 
is particularly intractable to methods of analysis like those which have just 
been described if capital is rationed to the firm in several of the periods 
when outlays will have to be made. For if the firm can only raise, say, 
$750,000 during the next year and no more than $1,250,000 during the year 
after, the marginal opportunity value of money in the two periods need not 
be the same. Even if the company needs the money more urgently during 
the former of these periods, it cannot arrange a transfer of funds from the 
second period to the first when borrowing is effectively rationed. It is as 
though an investment had two distinct prices, one involving payment in
pounds and one in francs, and the purchaser were short of both types of 
currency and could not convert from one to the other. A very complex 
type of balancing between scarce dollars today and scarce dollars tomorrow 
is required in such a situation, and correspondingly more subtle calculations 
are required for an optimal selection of investment projects.¹⁰


6. Illustration: Use of the Discounted Present-Value Criterion 


Case 1. A simple problem 


Given investments A and B with expected payoffs as indicated in Table
2 below, which of these produces the higher present value, given a dis-
counting rate of 6 per cent? With an interest rate of i = 0.06 the discount
rate is D = 1/(1 + i) = 1/1.06 = 0.94 (approximately) so that we have
\(D^2 = 0.89\), \(D^3 = 0.84\) (approximately).

TABLE 2

                          Year of Return
                Present     1st     2nd     3rd
Investment A     −1,000     500     700     500
Investment B       −600     400     800       0

Therefore, the approximate discounted present value of investment A is

\[ -1{,}000D^0 + 500D + 700D^2 + 500D^3 = -1{,}000 + 500(0.94) + 700(0.89) + 500(0.84) = \$513, \]

while that of investment B is

\[ -600D^0 + 400D + 800D^2 = -600 + 400(0.94) + 800(0.89) = \$488. \]

Hence A is the more valuable investment in terms of net discounted present
value.
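
The Case 1 figures can be checked with a short calculation. The sketch below computes the net present values twice: once with the discount factors rounded to two decimals, as in the text, and once unrounded. The function names are assumptions introduced for this illustration.

```python
from typing import Sequence


def npv_exact(stream: Sequence[float], i: float) -> float:
    """Net present value using unrounded powers of D = 1 / (1 + i)."""
    d = 1.0 / (1.0 + i)
    return sum(r * d ** t for t, r in enumerate(stream))


def npv_rounded_factors(stream: Sequence[float], i: float, places: int = 2) -> float:
    """Net present value with each power of D rounded first, as in the text."""
    d = 1.0 / (1.0 + i)
    return sum(r * round(d ** t, places) for t, r in enumerate(stream))


INVESTMENT_A = [-1000, 500, 700, 500]  # Table 2, row A
INVESTMENT_B = [-600, 400, 800, 0]     # Table 2, row B

if __name__ == "__main__":
    for name, stream in (("A", INVESTMENT_A), ("B", INVESTMENT_B)):
        print(name,
              round(npv_rounded_factors(stream, 0.06)),
              round(npv_exact(stream, 0.06), 2))
```

The unrounded values come out a dollar or two higher than the text's $513 and $488, but the ranking of A over B is unaffected.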


10 For other shortcomings in the discounted present-value analysis see J. Hirshleifer,
“On the Theory of Optimal Investment Decision,” Journal of Political Economy, Vol.
LXVI, August 1958, reproduced in Solomon, op. cit.
Comparing this result with the payout period calculation, we see that
investment A pays off $500 of its $1,000 cost after its first year, leaving the 
remaining $500 to be paid off during the second year when earnings accrue 
at the rate of $700 per year. Hence the payout period is one and five- 
sevenths years, approximately. Similarly, B requires approximately one 
and one-fourth years to recoup its investment cost. Thus, B has the shorter 
payout period and is the better investment on this crude criterion. But 
note that our payout calculation has taken no account of the fact that 
investment A is expected to earn $500 during its third year, while it is 
anticipated that B will yield nothing during that period. 


Case 2. A more complex problem 


To illustrate further the method of use of the discounted present-value 
criterion a typical problem from the literature of replacement theory (in 
operations research) will now be described in detail.¹¹


Problem: When to replace equipment with rising maintenance costs. A 
firm pays $2,000 for its automobiles. Their operating and maintenance 
costs are about $500 per year for the first two years and then go up by 
approximately $300 per year. When should such cars be replaced? 

First let us look at the problem in algebraic terms. The annual outlays
during years 0, 1, 2, etc., of a given car's life, until it is replaced, are
\(C_0 = 2{,}000\) (the cost of the car), \(C_1 = 500\), \(C_2 = 500\),
\(C_3 = 800\), etc. If the car is replaced annually, we will have an outlay of
\(C_0\) = $2,000 every year. If it is replaced every two years, we will have
the outlay stream \(C_0, C_1, C_0, C_1, C_0, \ldots\) whose elements repeat
themselves every two years, and, in general, if it is replaced after t years,
our stream of costs, S, becomes \(C_0, C_1, C_2, \ldots, C_t, C_0, C_1, C_2,
\ldots, C_t, C_0, \ldots\). To put these cost streams on a comparable
basis, we translate them into an equivalent fixed annual payment, that is,
an annual payment of a dollars whose discounted present value is the same
as that of our stream, S. Since the elements in stream S repeat themselves
every t years, we need only consider the present value of \(C_0, C_1, \ldots,
C_t\) and the present value of the a dollar payment for the same t years.
With the discount rate, \(1/(1 + i)\), represented by D, the present value of
the a dollar payments is given by [see Equation (1) of this chapter]


\[ V = a + Da + \cdots + D^t a = a\,\frac{1 - D^{t+1}}{1 - D}, \]


so that

\[ a = V\,\frac{1 - D}{1 - D^{t+1}}. \]


11 This illustration is adapted from M. Sasieni, A. Yaspan, and L. Friedman,
Operations Research: Methods and Problems, John Wiley & Sons, Inc., New York, 1959.


Here, since V is the discounted present value of the outlays on the car for 
the t years before it is replaced, we have


\[ V = C_0 + DC_1 + D^2C_2 + \cdots + D^tC_t. \]


Our object is to find the value of t which makes the fixed annual equivalent 
cost, a, as small as possible. Suppose, now, that the opportunity cost of 
having the receipt of money postponed for a year is such that a dollar one 
year in the future has a present value of only 90 cents, so that D = 0.9.
That is, suppose that management estimates that it can invest its additional
cash in such a way that by the end of the year each 90 cents will have,
on the average, increased to $1.¹² We may then calculate a as a function of
the length of life of an automobile in Table 3.


TABLE 3

Years (t)    C_t      D^t     V        (1 - D)/(1 - D^(t+1))    a
0            2,000    1       2,000    1                        2,000
1              500    0.9     2,450    0.53                     1,299
2              500    0.81    2,855    0.37                     1,056
3              800    0.73    3,439    0.29                       997
4            1,100    0.66    4,165    0.24                     1,000
5            1,400    0.59    4,991    0.21                     1,048
6            1,700    0.53    5,892    0.19                     1,119


Clearly, the table indicates that a is at a minimum at about t = 3 or 4. 
That is, the cars should, optimally, be replaced at something between 
three- and four-year intervals. 
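
The arithmetic behind Table 3 can be retraced with a short Python sketch. The
function name and its arguments are illustrative assumptions, not part of the
source; the cost figures and D = 0.9 are those of the text. Following the
table, t indexes the last year of outlays counted before replacement, so the
annuity covers the t + 1 payments made at dates 0 through t. Because nothing
is rounded along the way, the results differ slightly from the tabulated ones
(which multiply by rounded factors), and the lowest equivalent annual cost
then falls at t = 3.

```python
def equivalent_annual_costs(purchase_cost, running_costs, discount_factor):
    """Return a list of (t, V, a) for t = 0, 1, 2, ...

    V is the present value of the outlays C_0, ..., C_t and
    a = V * (1 - D) / (1 - D**(t + 1)) is the equivalent annual cost.
    """
    costs = [purchase_cost] + list(running_costs)  # C_0, C_1, C_2, ...
    rows = []
    present_value = 0.0
    for t, cost in enumerate(costs):
        present_value += discount_factor ** t * cost
        annuity_factor = (1 - discount_factor) / (1 - discount_factor ** (t + 1))
        rows.append((t, present_value, present_value * annuity_factor))
    return rows


# Figures from the text: C_0 = 2,000; C_1 = C_2 = 500; then +300 a year; D = 0.9.
rows = equivalent_annual_costs(2000, [500, 500, 800, 1100, 1400, 1700], 0.9)
for t, v, a in rows:
    print(f"t={t}  V={v:9.2f}  a={a:9.2f}")

best_t = min(rows, key=lambda row: row[2])[0]
print("lowest equivalent annual cost at t =", best_t)
```

A trade-in value could be handled in this sketch by subtracting it from the
last entry of the cost list for each candidate t, in line with the first
extension noted in the text.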

Before leaving this example two extensions will be noted very briefly. 
First, observe that the scrap or trade-in value of the car when replacement 
takes place is easily included in the calculation. The amount received for 
the old car need only be considered a negative cost for that year and should 
be deducted from the value of \(C_t\).

Second, exactly the same computation can be employed in deciding
whether to switch to a more expensive vehicle which costs less to maintain.
We need merely compare the minimum fixed annual equivalent cost for
our $2,000 car, a, with the corresponding figure for the more expensive
vehicle. The car which yields the lower a figure will clearly do the job
most inexpensively in the long run.


12 This is tantamount to having an interest rate, i, of approximately 11.1 per cent,
for D = 0.9 = 1/(1 + i), so that solving for i we obtain i = 1/9 = 0.111....

The artificiality of the case which has been described should be apparent 
enough; however, it should adequately indicate the nature of replacement 
analysis. 


PROBLEMS 


1. Compute the discount rate if the interest rate is 

(a) 5 per cent 

(b) 7 per cent. 
2. Suppose the interest rate is 5 per cent and that a certain investment is expected 
to yield $500 at once, $700 at the end of one year, $200 at the end of the second 
year, and then be scrapped. Using the equation \(a + aD + \cdots = a/(1 - D)\),
calculate how many dollars paid out each year in perpetuity is equal in present
value to the capitalized (i.e., discounted) present value of our investment.
(Hint: Let a dollars per year be the amount paid in perpetuity. What is the 
capitalized value of this stream? What is the present value of the earnings of 
our investment? What is the value of a?) 
3. Given three investment projects described in the following table:


Project    Current Payoff    1st Year Payoff    2nd Year Payoff
A          -300
B          -300
C          -300


(a) Assuming that the $300 loss in the current year represents the cost of the 
project, calculate the payout period of each investment. 
(b) Calculate the marginal efficiency of each investment. 


(c) With the interest rate at 10 per cent, calculate the discounted present 
value of each project. 


(d) Which project is best on each criterion? 


7. Indivisibility, Interdependence, and Capital Rationing Problems


Frequently, the investment decision is beset by a number of complications
which require that its analysis employ tools which are rather more
sophisticated. We shall see in this section that in order to satisfy the basic
principles of optimality, the investment decision process must sometimes
employ the methods of integer programming.¹³


Four such complications merit particular attention: 


1. The first of these problems, which has already been mentioned, 
grows out of the funds limitations which sometimes circumscribe the invest- 
ment decision. In the presence of capital rationing one is no longer con- 
cerned with the selection of an ideal set of investments whose acquisition 
might well require resources beyond those available to the firm. Rather, 
one must determine the best course which can be followed with the limited 
funds on hand. 

2. A second problem arises because investment projects are, characteris- 
tically, indivisible, all-or-none propositions. If funds run short, it may be 
possible to scale down the plans for a proposed new building. But it is 
generally not possible to purchase 6.307 machines of a given variety. In 
practice, management is frequently offered several fairly definite alterna- 
tive sets of specifications which can meet the requirements of a given job. 
It must then choose one among these alternatives, or it can reject them all, 
but frequently it cannot rescale or combine the options. This means that in 
deciding how many units of a given item a firm should acquire, fractional 
answers must be rejected as meaningless. And where an all-or-none decision 
is called for, the answer must be restricted to two possible values—0 or 1. 
Either one of the contemplated office buildings will be constructed, or the 
project will be abandoned altogether. 

3. A third special characteristic of many investment decisions is that 
the available alternatives must be considered mutually exclusive. If man- 
agement decides on project A, project B is thereby rejected automatically. 
Such a case arises, for example, where several alternative designs are 
proposed for a single warehouse or a single factory. 

4. Our final source of analytical difficulty inherent in many investment 
decisions is encountered where one project should not be undertaken with- 
out another. The company should not purchase weaving machinery (pro- 
ject C) without a factory in which to house it (project D). Here project D 
may or may not be undertaken in the absence of C (the building might be 
used for other purposes), but C should not even be considered unless D will 
be adopted. 

13 This was first emphasized by H. Martin Weingartner, Mathematical Programming
and the Analysis of Capital Budgeting Problems, Prentice-Hall, Inc., Englewood Cliffs,
N.J., 1963.


Let us examine these issues in turn, though we shall see at the end of
the discussion how they can be handled together with the aid of a single
calculation procedure. The approach required by the funds limitations
which restrict management's freedom of action should be obvious to anyone
who has studied linear programming or differential calculus.¹⁴ The funds
limitations simply must be treated as the constraints of the problem. To
formulate our basic programming model, we shall require the following
notation: Let


\(x_i\) represent the number of units of investment i which are undertaken
(e.g., the number of trucks which are purchased by the firm);

\(N_i\) be the net discounted present value of the expected future returns
on one unit of investment i;

\(m_{it}\) represent the amount of money required during period t to finance
a unit of investment i;

\(M_t\) be the total amount of money available for investment during
period t.


Then our objective function requires us to maximize the net discounted
present value of all projects which are undertaken, i.e., maximize¹⁵


\[ N = N_1 x_1 + \cdots + N_n x_n. \]


This is to be accomplished within the limited resources available to the
firm as represented by the constraints corresponding to each of the pertinent
time periods (from the present, t = 0, until some terminal date, t = w).


\[
\begin{aligned}
m_{10}x_1 + m_{20}x_2 + \cdots + m_{n0}x_n &\le M_0 \\
m_{11}x_1 + m_{21}x_2 + \cdots + m_{n1}x_n &\le M_1 \\
&\;\;\vdots \\
m_{1w}x_1 + m_{2w}x_2 + \cdots + m_{nw}x_n &\le M_w.
\end{aligned}
\tag{2}
\]


14 The application of programming to this problem is due to Weingartner, ibid.
For what may be the first such formulation employing differential calculus see Helen
Makower and W. J. Baumol, “The Analogy Between Producer and Consumer Equilibrium
Analysis,” Economica, N.S. Vol. XVII, February 1950.

15 Strictly speaking, the following construction oversimplifies matters considerably
by assuming we know the interest rate with which to discount the returns from each
project. For, since capital is rationed in our problem, the appropriate discount rate
itself depends on the best use which can be made of the funds. In other words, the
discount rate depends on the optimality calculation and vice versa. In principle, the
solution of the problem requires a nonlinear programming analysis in which the dual
prices are the marginal yields of money in the various periods, and, hence, the appropriate
discount rates. In this problem the discounted present values, the \(N_i\), themselves are
functions of the dual prices, and hence are variable.

The following formulation also assumes that management is considering only a
fixed number (n) of alternatives. These need not all be undertaken in the current period.
But the amount of a project to be undertaken in, say, period 5, must be represented by a
variable \(x_j\) different from that (\(x_i\)) which corresponds to a similar project to be
initiated in an earlier period. Of course, a project to be undertaken in period 5 is likely
to require no investment funds in earlier periods, i.e., it may be characterized by the
conditions \(m_{j0} = m_{j1} = m_{j2} = m_{j3} = m_{j4} = 0\).


And, as usual, we also require


\[ x_1 \ge 0, \quad x_2 \ge 0, \quad \ldots, \quad x_n \ge 0. \tag{3} \]


Let us review the interpretation of each of these relationships. Our
objective function multiplies \(N_i\), the net present value per unit of project
i, by \(x_i\), the number of units of i which are undertaken, to yield the total
return from investment i. These returns for all potential projects are then
added together to represent the total yield of any investment program
which the company may adopt.

Our structural constraints (2) are constructed similarly. For example,
the first of these deals with the capital requirements of the initial period,
t = 0. It states that the sum of the moneys required in period zero for all
investment projects which are undertaken, \(\sum_i m_{i0} x_i\), must not exceed
\(M_0\), the quantity of money capital which will be available to the company
during this period. Finally, our nonnegativity requirements (3) indicate
that it is technologically impossible for the firm to construct a negative
number of factories.¹⁶

These three sets of expressions represent our limited-funds investment
problem in its entirety as a standard linear programming problem. It can
then be solved by the standard linear programming techniques which have
been described in Chapter 5, to yield an optimal investment program for
the firm. If a problem which is encountered in practice involves significant
nonlinearities, a nonlinear programming calculation can be substituted, in
principle, though the very serious difficulties of data gathering and
computation which such an analysis may incur cannot be overstressed.

Next we turn to the second difficult characteristic of the investment
decision problem, the indivisibility of many investment projects. This
means that our quantity variable \(x_i\) cannot meaningfully be permitted to
take fractional values. If \(x_i\) represents the number of freight cars purchased,
we can have \(x_i = 1\), or 5, or any other integer value, but \(x_i = 3.72\) is
nonsense. In such circumstances we must therefore add to our previous
requirement in our programming model


\[ x_i \ \text{integer (for all } i \text{ corresponding to indivisible projects).} \tag{4} \]


16 A firm may, in fact, disinvest by selling some property. Such an activity can be
handled by representing its amount by another variable \(x_k\), where, since this symbol
represents the amount sold, we may again require \(x_k \ge 0\). For a method which deals
with the possible dependence of the decision to sell on a prior decision to purchase see
the discussion of interdependent investment decisions below.


It will be recognized that this immediately transforms our construct 
into an integer-programming model. Highly promising integer programming 
techniques are already in existence, and these have frequently proved to 
be very effective and powerful. 

The indivisibility of investment projects sometimes gives rise to a 
particular type of situation where the decision involved is whether or not 
to undertake a particular project. There is no question as to the number 
of these facilities—either the item will be constructed or it will not be. 
Thus the number of units of the item in question cannot exceed unity, i.e., 
the corresponding \(x_i\) can only take either the value 0 or 1. If \(x_i = 1\), the
project will be adopted; if \(x_i = 0\), it will be rejected. This requirement can
easily be incorporated into our model by means of the constraint


\[ x_i \le 1. \tag{5} \]


For by (3) we must have \(x_i \ge 0\), and by (4) \(x_i\) must take an integer
value. There are then only two possibilities left, \(x_i = 0\) or \(x_i = 1\), just as we
require. Our programming calculation then handles the all-or-none problem
simply by the addition of constraint (5).
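
To make the model formed by (2) through (5) concrete, the following Python
sketch solves a tiny all-or-none capital rationing problem by listing every
0/1 program and keeping the best feasible one. This is plain exhaustive
enumeration, practical only for a handful of projects, and is not one of the
integer programming algorithms the text has in mind. The function name and all
of the figures are hypothetical; `cost[t][i]` plays the role of \(m_{it}\) and
`budget[t]` that of \(M_t\).

```python
from itertools import product


def best_program(npv, cost, budget):
    """Enumerate all 0/1 programs x and return (best total NPV, best x).

    npv[i]     -- N_i, net present value of project i
    cost[t][i] -- m_it, funds project i requires in period t
    budget[t]  -- M_t, funds available in period t
    """
    best_value, best_x = float("-inf"), None
    for x in product((0, 1), repeat=len(npv)):  # constraints (3), (4), (5)
        within_budget = all(  # constraints (2), one per period
            sum(cost[t][i] * x[i] for i in range(len(x))) <= budget[t]
            for t in range(len(budget))
        )
        if within_budget:
            value = sum(n * xi for n, xi in zip(npv, x))
            if value > best_value:
                best_value, best_x = value, x
    return best_value, best_x


# Hypothetical data: four projects, two budget periods.
npv = [30, 45, 25, 20]
cost = [[20, 30, 15, 10],   # period 0 requirements
        [10, 10, 15, 5]]    # period 1 requirements
budget = [50, 25]

value, program = best_program(npv, cost, budget)
print(value, program)
```

With these figures the first two projects together exhaust the period 0
budget, and no feasible program yields a higher total net present value.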

We come now to our last two special investment decision problems: 
mutually exclusive projects and the case of a project whose utility is condi- 
tional upon the adoption of some other investment. If a number of invest- 
ments are mutually exclusive, a nonprogramming calculation becomes 
extremely difficult in the presence of funds limitations because of the 
combinatorial problems which arise. If investment B costs less than in- 
vestment A, then should the company choose to invest in B this will re- 
lease funds which can be invested in yet another project or in several other 
projects. Thus A is the better project only if its return is higher than that 
offered by the most lucrative available combination of projects (where 
project B is included in the combination). Such a complex decision problem 
can easily be handled by our programming model. If A and B are our 
mutually exclusive alternatives, we need merely substitute for constraint 
(5) the condition 


\[ x_a + x_b \le 1, \tag{6} \]


where \(x_a\) and \(x_b\) are the respective quantities of projects A and B
undertaken. Since we already require \(x_a \ge 0\), \(x_b \ge 0\) and since both of
these variables are restricted to integer values, we see that only three
possibilities are permitted to us by condition (6):


I.   \(x_a = 0\), \(x_b = 0\)

II.  \(x_a = 0\), \(x_b = 1\)
or

III. \(x_a = 1\), \(x_b = 0\).


Our solution therefore permits us to undertake at most one of the two
projects, which is exactly what is required in the case of a pair of mutually
exclusive alternatives. Precisely the same sort of device (with the constraint
this time being \(x_a + x_b + x_c \le 1\)) will clearly work if only one of,
say, three projects, A, B, and C, can be undertaken.

Finally, suppose some project C is not to be undertaken unless D is also
adopted. We can arrange for our calculation to take this difficulty into
account by requiring it to satisfy, instead of constraint (6), the condition


\[ x_c \le x_d. \tag{7} \]


By (3), (4), and (5) \(x_d\) must either be 0 or 1. If \(x_d\) is 0 (project D
rejected), then by (7) we must also have \(x_c = 0\), so that C will then
automatically be rejected also. If, on the other hand, \(x_d = 1\), we may have
either \(x_c = 1\) or \(x_c = 0\). Thus project C will certainly not be undertaken
unless D is, but otherwise there is complete freedom of selection in the 
decisions on these two projects. 
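
The effect of conditions (6) and (7) can be checked by listing the 0/1
combinations they leave open. The sketch below is only an illustration; the
function name is hypothetical, and the four variables stand for the projects
A, B, C, and D of the text.

```python
from itertools import product


def permitted(x_a, x_b, x_c, x_d):
    """True if a 0/1 program satisfies conditions (6) and (7)."""
    mutually_exclusive = x_a + x_b <= 1  # (6): at most one of A and B
    contingent = x_c <= x_d              # (7): C only if D is adopted
    return mutually_exclusive and contingent


allowed = [x for x in product((0, 1), repeat=4) if permitted(*x)]
for x_a, x_b, x_c, x_d in allowed:
    print(f"A={x_a} B={x_b} C={x_c} D={x_d}")
```

Three choices remain for the pair (A, B) and three for the pair (C, D), so
nine of the sixteen conceivable programs survive. In a full calculation the
same two tests would simply be added to the budget constraints.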

Let us summarize the results of this section. We have examined a num- 
ber of problems which frequently beset investment decision calculations. 
While the difficulties to which these give rise have only been indicated very 
briefly and with no detail, the fact is that until recently no general method 
was known for finding an optimal solution to an investment decision in- 
volving any one of them. We have seen, however, how all of these circum- 
stances can readily be incorporated into a programming model and how 
the corresponding programming calculation can be forced to yield an 
optimal solution which takes all of these requirements into account. 


PROBLEM 


Suppose five investment projects are under consideration and that they 
involve the following net present values and cost commitments:


                     Project 1
Net present value    70
Year 1 cost          20
Year 2 cost


Suppose that the company has 40 (million) to invest in the first year, and 30 in 
the second, and that projects 2 and 4 are mutually exclusive alternatives. Write 
out the programming formulation of this problem. 


8. Risk and the Investment Decision 


Up until this point the problems of risk and imperfect foresight have 
been ignored. But these are really crucial for the investment decision, which 
necessarily yields the bulk of its fruits in the future, sometimes in the very 
distant future. This section therefore reviews briefly some of the approaches 
which have been proposed for dealing with the consequences of our imper- 
fect ability to predict events which have not yet occurred. We shall start 
with the most heuristic and operational methods of procedure and then we 
will outline some of the more subtle and abstract methods which have been 
devised. 

a. The finite-horizon method. In practice, a number of decision proce- 
dures (which will be recognized as first cousins of the payback criterion) 
have just laid down a terminal date beyond which any prospective
developments are simply left out of consideration. For example, in deciding
whether to construct a dam or to undertake some other waterway de- 
velopment project, the Federal government has frequently adopted a 
fifty-year horizon, treating all facilities as though they were certain to dis- 
appear without a trace exactly one-half century after their erection. The 
logic of the procedure is the view that any forecast for a period longer than 
fifty years is so unreliable that it is best not undertaken at all. 

But the economist cannot accept this resolution of the problem except, 
possibly, as a crude device which saves decision-making costs. After all, 
the view that a dam will vanish like the one-horse shay, precisely on its 
fiftieth anniversary, is itself a prediction, and a highly implausible one at 
that. A routine decision which implicitly treats this conventional flight of 
fancy as though it were fact can lead to some rather peculiar and
indefensible conclusions. Suppose we are to decide between two canal locks which
cost roughly the same and are equally serviceable in their early years. One 
of them, however, will probably last little more than fifty years, whereas 
the other is likely to remain serviceable indefinitely. The fifty-year horizon 
precludes us from deciding in favor of the latter even though it is so clearly 
preferable. Many public investment decisions do have considerable long- 
term effects which cannot be ignored in this way. Particularly the conserva- 
tion effects of a project may become really important only in the fairly 
distant future, when in their absence, soil depletion, serious flooding, and a 
number of other untoward consequences might well assume significant, if 
not catastrophic, proportions. A finite and arbitrary horizon, then, is not 
really a defensible method for dealing with imperfect foresight. It takes no 
account of our limited ability to predict events in the more immediate 
future (which is sometimes as distant as twenty-five years from the present) 
and forces us to ignore totally what little we can forecast about the more 
distant future with some degree of confidence. 

b. Discounting for risk. A procedure for dealing with risk which is far 
more attractive than the finite-horizon method is the use of a risk discount 
factor. This procedure consists in an addition to the rate of interest figure 
employed in the discounting calculation. For example, suppose the actual 
rate of interest is 6 per cent; the rate used in discounting might then be 
increased by what we can call a risk factor of, say, 1 per cent, to a total of 
7 per cent for a mildly risky prospect, and we might add a risk factor of 
3 per cent to get a 9 per cent total where a much more speculative
investment is in question. Such a risk discount, δ, always reduces the value of the
discount rate, for the discount rate becomes, under its influence,
\(D = 1/(1 + i + \delta) < 1/(1 + i)\). In other words, the higher the risk, the
more we lower our evaluation of a given expected return because (if that return
is expected n periods in the future) we multiply it by a smaller fraction,
\(D^n\) (the nth power of the discount rate, D), where, of course, both D and
\(D^n\) always lie between zero and unity.¹⁷


17 For if the interest rate and risk discount are positive, we have
\(D = 1/(1 + i + \delta) < 1\), and if the interest rate and risk discount are finite,
we must, by the same equation, have \(D > 0\). And with any such value of D we must
have \(0 < D^n < 1\) for any \(0 < n < \infty\).


Since in the discounting process more distant returns are multiplied by
higher powers of the discount factor, this procedure automatically assigns a
higher weight to the risk factor in more remote future periods, unless (as is
always possible) we use different risk factors for different dates. However,
the risk factor will still take some account of the risk involved in more
proximate returns and will never completely drop out of the calculation
for expected returns for any future date, however remote. Since the
discount rate, D, lies between zero and unity, any power of D will also lie
in the same interval. For example, the second-year return \(R_2\) is multiplied
by \(D^2 = [1/(1 + i + \delta)]^2\) in discounting, so that its present value,
\(D^2R_2\), will be somewhat reduced because of the presence of the risk factor, δ.
But a nonzero return expected even in the 154th year, \(R_{154} > 0\), will have a
positive present value, \(D^{154}R_{154}\), because \(D^{154} > 0\). Thus we see that
the risk-discount method has some rather desirable properties and is relatively
easy to handle. Its basic difficulty is that it comes with no explicit
instructions which permit us to calculate the appropriate value of δ, the
risk discount factor, and it must usually be estimated on the basis of some sort
of judgment or intuition. Since, in any event, it must take into account the
degree of the investor's risk aversion, i.e., the extent to which he is repelled
by risk, the evaluation of δ must perhaps necessarily remain subjective in
most cases. Moreover, the choice of a unique risk-discount factor to be
used to discount all future revenues also assumes implicitly that the
riskiness of investment is never affected by the passage of time, a premise
which is certainly not always true.
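
The behavior of the risk discount can be seen in a few lines of Python. This
is a minimal sketch: the function name and the return of 1,000 are
hypothetical, while the 6 per cent interest rate and the risk factors of 1 and
3 per cent are the figures used in the text.

```python
def risk_adjusted_present_value(expected_return, years_ahead,
                                interest, risk_discount=0.0):
    """Present value using D = 1 / (1 + i + delta)."""
    discount_rate = 1.0 / (1.0 + interest + risk_discount)  # D in the text
    return discount_rate ** years_ahead * expected_return


# Hypothetical return of 1,000, valued 2 and 154 years ahead.
for delta in (0.0, 0.01, 0.03):
    near = risk_adjusted_present_value(1000, 2, 0.06, delta)
    far = risk_adjusted_present_value(1000, 154, 0.06, delta)
    print(f"delta={delta:.2f}  year 2: {near:8.2f}  year 154: {far:.6f}")
```

A larger risk discount lowers both figures, and lowers the distant one
proportionately more, yet the value of the 154th-year return never reaches
zero. Supplying a different `risk_discount` for each date would relax the
assumption that riskiness is unaffected by the passage of time.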

c. The probability theory approach. A third procedure for the investiga- 
tion of risky investment prospects bases itself on standard probability 
theory. The approach points out that no single expected-return figure can 
adequately represent the full range of possible alternative outcomes of a 
risky undertaking. Rather, a large number of alternative payoffs must be 
considered for each pertinent future date, and each such possibility must be 
accompanied by an associated probability. If the return at date t is
represented by \(R_t\) and if only a finite set of values of \(R_t\) is considered
possible, we must deal with a probability function \(P = f(R_t)\) which asserts
that, for example, there is a 5 per cent probability that the return in year t will
exceed $100,000, that the probability is 8 per cent that the return will fall 
between $90,000 and $100,000, etc. 

The risk-discount method clearly takes no account of this full range of 
possibilities and their associated probabilities. It can therefore be argued 
with justice that the risk-discount approach ignores elements which are 
important in an effective calculation of an optimal investment policy. 

Unfortunately, for most applications the discussion of the probabilistic 
approach has not proceeded much beyond criticism of the risk discount. If 
the full probability function were ever known (which is rarely, if ever, the 
case for any investment project), standard actuarial methods of evaluation 
could be used. 
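
If the probability function were in fact known, one simple form of actuarial
evaluation would be to discount the expected return of each year. The Python
sketch below illustrates only that reading; the function name, the discount
rate of 0.9, and every return and probability in it are hypothetical.

```python
def expected_present_value(distributions, discount_rate):
    """Sum over dates t of D**t times the expected return at date t.

    distributions[t] maps each possible return at date t to its probability.
    """
    total = 0.0
    for t, distribution in enumerate(distributions):
        if abs(sum(distribution.values()) - 1.0) > 1e-9:
            raise ValueError(f"probabilities for date {t} do not sum to 1")
        expected_return = sum(r * p for r, p in distribution.items())
        total += discount_rate ** t * expected_return
    return total


# Hypothetical project: a certain outlay now, uncertain returns afterwards.
project = [
    {-1000: 1.0},                     # date 0: outlay
    {400: 0.25, 600: 0.5, 800: 0.25}, # date 1: expected return 600
    {0: 0.2, 1000: 0.8},              # date 2: expected return 800
]
print(expected_present_value(project, 0.9))
```

The single figure that results says nothing about the spread of outcomes
around it, which is why such a calculation is most defensible when many
similar investments are undertaken.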

But even these procedures are fully defensible only if many similar 
investments are to be undertaken. An insurance company can confidently 
employ actuarial calculations because, with a large number of policyholders, 
the cases which turn out badly are virtually certain to be counterbalanced
by the cases which turn out unexpectedly well. But in an investment
decision which will never be repeated, the justification for actuarial calculations
is somewhat more shaky. It follows that (with the probabilities typically 
unknown) such a method is often neither practical nor fully defensible. 

However, there do exist some relatively standardized assets which have 
been in use sufficiently long to make possible at least crude estimates of 
some of the characteristics of the payoff-probability distribution. An 
important application is Markowitz’ powerful analytic procedure which 
utilizes such an approach for the selection of portfolios of securities— 
combinations of stocks, bonds, and other financial instruments which 
constitute a significant portion of the total investment of certain types 
of company. 

The Markowitz approach utilizes two focal measures: an index of 
expected earnings and an index of risk. Given any one security, one can 
estimate its anticipated average future earnings by extrapolation of past 
experience and the use of judgment based on knowledge of the issuer and of 
the circumstances of the market as a whole. On the same basis, one can 
calculate a rough figure for the standard deviation of these earnings which 
serves as the basis for the construction of a measure of risk. Using pro- 
gramming methods, Markowitz then calculates for any given level of 
expected earnings what portfolio (i.e., which combination of securities) 
minimizes the index of risk. For example, Figure 4 indicates that at an 
earnings level E the optimal portfolio will incur risk R. By calculating more
such points we obtain AA’, which we may call the risk-earnings possibility 
curve, i.e., the curve which tells us for each attainable level of earnings the 
smallest risk which can be incurred. 
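
The idea of tracing out the curve point by point can be imitated on a very
small scale. The Python sketch below is not Markowitz' programming procedure:
it replaces the programming calculation with a coarse grid search over the
proportions of three securities, takes the standard deviation of portfolio
earnings as the index of risk, and allows no borrowing or short sales. The
function name, the expected earnings, and the covariance figures are all
hypothetical.

```python
from math import sqrt


def min_risk_portfolio(means, cov, target, steps=50):
    """Lowest-risk mix of three securities earning at least `target`.

    Searches weights that are multiples of 1/steps and sum to one.
    Returns (risk, weights), or None if no mix reaches the target.
    """
    best = None
    for a in range(steps + 1):
        for b in range(steps + 1 - a):
            w = (a / steps, b / steps, (steps - a - b) / steps)
            earnings = sum(wi * mi for wi, mi in zip(w, means))
            if earnings < target - 1e-12:
                continue  # expected earnings too low
            variance = sum(w[i] * w[j] * cov[i][j]
                           for i in range(3) for j in range(3))
            risk = sqrt(variance)
            if best is None or risk < best[0]:
                best = (risk, w)
    return best


# Hypothetical expected earnings and covariances of three securities.
means = (0.05, 0.08, 0.12)
cov = [[0.0004, 0.0002, 0.0000],
       [0.0002, 0.0036, 0.0018],
       [0.0000, 0.0018, 0.0225]]

for target in (0.06, 0.08, 0.10):
    risk, weights = min_risk_portfolio(means, cov, target)
    print(f"earnings >= {target:.2f}: risk {risk:.4f}, weights {weights}")
```

Each line printed corresponds to one point of a risk-earnings possibility
curve for these three securities. The choice among the points is left, as in
the text, to the investor's attitude toward risk.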

Now the entire curve consists, in some sense, of optimal combinations 
among which one cannot choose a priori. One investor may prefer lower 
risk even if it reduces his expected earnings, and so he may desire a
relatively low risk point like L. Another investor who is more inclined to gamble
may prefer a high expected earnings, high risk combination such as H.


[Figure 4: risk (vertical axis) plotted against earnings (horizontal axis),
showing the region of attainable risk-earnings combinations, the possibility
curve AA', the indifference curves I, the point H, and the risk level R
associated with the earnings level E.]

For any prospective purchaser of such a combination of assets it is pos- 
sible, in principle, to draw in a risk-earnings indifference map such as the I 
curves in the figure. These curves, it will be noted, are inverted in shape as 
compared with the standard indifference curves of the theory of consumer 
demand. This is because, other things being equal, we usually prefer lower 
levels of risk, i.e., we usually prefer points which are lower in the diagram. 
The curves have a positive slope because as risk increases, we require a 
higher level of earnings to keep us indifferent. If the diagram is turned 
upside down so that, as usual, higher points correspond to preferred
combinations, we see that the indifference curves look quite normal.¹⁸ For the
investor shown, the optimal combination is, as usual, T, the point of 
tangency between the possibility curve and an indifference curve, because 
that gets him to his preferred earnings-risk combination, i.e., it tells him 
the portfolio which most effectively satisfies his goals. 

In practice, the Markowitz calculation can take data pertaining to
relatively large numbers of securities and select the “efficient” portfolio 
combinations from among them (i.e., combinations which are optimal in 
the sense that they correspond to points on the possibility curve AA’). 
However, the choice of the investor's preferred portfolio from among these 
efficient combinations is still left to the judgment of the individual investor 
and his attitudes on risk.¹⁹


18 The reader will recognize that the slope of the indifference curve is the investor's 
marginal rate of substitution between risk and earnings. The flatter the curve, the less 
the added risk he is willing to undertake for a given increase in earning, i.e., the less of 
a gambler he is. 

19 One interesting result of the Markowitz analysis is that it will usually lead to the
choice of a diversified portfolio of investments rather than one best (highest present value)
investment. By not putting all of our nest eggs into one investment we are, in effect,
hedging against the possibility that one of the projects will turn out very badly. Since
the Markowitz calculation explicitly takes into account the effects of diversification on
risk, it normally will recommend against concentration of funds on one or a very few
types of investment. Recently Sharpe, Lintner, and Mossin have virtually simultaneously
used the portfolio analysis as a basis for a powerful model of capital asset
pricing. Assuming that all investors can borrow and lend as much as they want at a
fixed rate of interest and that, given expected earnings, they seek to minimize the risk
of their portfolio, the model leads to the curious conclusion that each investor will
want to hold some stock of every firm in the economy and that any one security will
constitute the same percentage of each investor's portfolio! See W. F. Sharpe, “Capital
Asset Prices: A Theory of Market Equilibrium under Conditions of Risk,” Journal of
Finance, Vol. 19, September 1964, pp. 425-42; John Lintner, “The Valuation of Risk
Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets,”
Review of Economics and Statistics, Vol. 47, February 1965, pp. 13-37; and Jan Mossin,
“Equilibrium in a Capital Asset Market,” Econometrica, Vol. 34, October 1966,
pp. 768-83.


d. Sensitivity analysis and risk. Still another method for taking risk
into account in investment decisions employs what may be called sensi- 
tivity analysis. Here one decides which variables in the calculation are most 
crucial and most uncertain, and he tests how sensitive the calculated 
present-value figure is to likely changes in the value of this strategic vari- 
able. For example, suppose that a company is considering investing in a 
new product and that its plans are based largely on its hope of capturing 
12 per cent of the market for this type of product within the next year. 
From this and its other information it can estimate in the usual manner 
(1) the expected present value of the investment if its expectations are 
fulfilled, (2) the effect of alternative possible market share figures on the 
calculated present value of that investment (all other things remaining 
equal), and, in particular, (3) the market share figure at which the net 
present value of the investment is zero, i.e., the break-even market share. 

Suppose it turns out that this break-even market share is 8 per cent, so 
that unless it attains this market share within a year the investment will 
yield an absolute loss. This offers the firm some indication of the risk in- 
volved in the investment and enables management to associate with this 
project a subjective risk figure. In the same way, management can obtain
such a risk index and a measure of expected yield (or present value) for 
each prospective investment. One can then proceed to construct a curve of 
attainable risk—earnings combinations as in Figure 4—and to select an 
optimal project or combination of projects as in the Markowitz procedure. 
Of course, a sensitivity analysis is more subjective and less powerful
analytically than the Markowitz method. However, sensitivity analysis 
can be used in cases where the probabilistic information required by 
Markowitz is simply unavailable, and the relatively crude sensitivity 
calculation can be performed rather quickly and inexpensively. 
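
The break-even calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the market size, profit margin, outlay, discount rate, and horizon are hypothetical figures not taken from the text, and the function names are invented for the example. Present value is assumed to rise with market share, so the break-even share can be found by bisection.

```python
def present_value(share, market_size=10_000_000.0, margin=0.25,
                  outlay=800_000.0, rate=0.06, years=5):
    """Net present value of the project for a market share between 0 and 1.

    All default figures are hypothetical and serve only as an illustration.
    """
    annual_profit = share * market_size * margin
    discounted = sum(annual_profit / (1.0 + rate) ** t
                     for t in range(1, years + 1))
    return discounted - outlay


def break_even_share(npv, lo=0.0, hi=1.0, tol=1e-9):
    """Bisection for the share at which npv(share) is zero.

    Assumes npv is increasing in the share, negative at lo, positive at hi.
    """
    if npv(lo) > 0 or npv(hi) < 0:
        raise ValueError("no break-even share inside the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if npv(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Step (2) of the text: present value at alternative market shares.
    for share in (0.06, 0.08, 0.10, 0.12, 0.14):
        print(f"share {share:.0%}: NPV = {present_value(share):,.0f}")
    # Step (3) of the text: the break-even market share.
    print(f"break-even share = {break_even_share(present_value):.2%}")
```

Comparing the hoped-for share with the break-even share gives management the margin of safety on which its subjective risk figure can be based.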

e. The use of decision theory and Neumann-Morgenstern utility. The 
most recent, and perhaps the most sophisticated, methods which have been 
suggested for coping with risk and uncertainty are the Neumann-Morgen- 
stern utility index and the criteria of decision theory. Since neither of 
these is, at least for the present, available in a form which permits its 
direct employment in concrete investment decisions and since both types 
of analysis are described in detail in other chapters in this book, there is no 
point in rediscussing the concepts here. However, the reader may well 
considerably increase his understanding of the entire problem by referring 
back to those chapters at this point. 

We conclude, then, that among the variety of methods for dealing with 
risk which have been described, only three—the finite-horizon, the risk- 
discount, and the sensitivity-analysis approaches—are readily usable in 
practice. All three of them suffer from serious shortcomings, but economic 
analysis suggests that the latter two, the risk-discount and the sensitivity 
approaches, are far preferable to the first. Indeed, experience suggests
that in many concrete applications they are reasonable and relatively 
trustworthy methods. In addition, the Markowitz approach can be ex- 
tremely helpful, but it can be utilized only where reasonable estimates of 
the pertinent probabilities can somehow be obtained. 


PROBLEMS 


1. Suppose the risk discount, r, is 4 per cent. Recalculate the discounted present 
values of investments A and B in Table 2 of Section 6, assuming that the rate 
of interest, i, remains at 6 per cent.

2. Explain why, though the same risk discount is used for both projects, it has made B
the better investment, whereas the calculation which ignored risk indicated 
that A was the more lucrative project. 


9. Financing Investments: Alternative Methods 


As is well known, there are several alternative methods whereby a 
firm can obtain the funds with which to finance an investment project. 
This section will list some of the more important of these financing tech- 
niques and will offer some relevant comments. In the next section, some 
of the considerations pertinent to an optimal financing program will be 
examined. 

a. Plowback. By far the greatest portion of corporate investment in the 
United States is financed out of funds which are acquired internally. That 
is, rather than paying company earnings out to stockholders, they are 
plowed back into the firm. This method avoids the heavy transactions costs which
must be incurred in borrowing funds or in issuing new securities. It incurs
less uncertainty for management, which can be sure of any funds it holds
back, while it may not be completely confident of the results of an attempt 
to market new stocks or bonds. Finally, many stockholders seem to prefer 
this method of financing because it tends to transform income into capital 
gains by increasing the value of the company’s stocks rather than providing 
dividend payments. This can offer a substantial tax advantage, particularly 
to shareholders in the upper income brackets. 

b. Issuing New Shares of Common Stock. A second method whereby a 
company can obtain cash is through the sale of new common stock. Except 
from the point of view of any powerful stockholders who exercise real con- 
trol over company policy, a share of stock can be considered to represent 
another form of loan to the company. True, it is more risky than a bond, 
but the shareholder receives from it dividends and capital gains if he is 
fortunate, and he can attempt to get his money back (terminate his loan) 
by selling the stock to another potential lender. The usual point of view
that the stockholder is part owner of the company and that the bondholder
is not is therefore a somewhat oversimplified view of the real distinction 
between the two. 

c. Sale of Bonds. Management may borrow money directly by selling to 
the public some variety of bond, e.g., a mortgage bond, which supposedly 
pledges some piece of company property as security for the loan, or a 
debenture bond, which represents a general pledge of part of the value of 
the firm. The use of bond financing offers a distinct tax advantage since 
interest payments to bond holders are legally considered to be costs and 
are therefore not subject to the corporate income tax, whereas dividends, 
the corresponding payment for money obtained by the sale of stock, are 
considered to constitute income for the company and are therefore taxable. 

To stockholders the sale of bonds is sometimes considered to be an 
advantage because it provides what is called leverage. Bonds are normally 
less risky than the stocks of the same firm because interest on its bonds
represents a prior claim on company earnings which must be paid before 
any dividends are provided. Moreover, by holding a bond until maturity 
(the termination date of the loan which it represents) the bond holder can 
be sure of receiving the face value of the bond, provided the company 
remains solvent. The traditional view of the matter, then, is that their 
comparative safety makes bonds a relatively inexpensive way for the 
company to obtain money, i.e., the interest cost on a bond may be expected 
to be lower than it would be if the bond, as it were, absorbed its share of the 
company’s risks, that is, interest cost normally is lower than the yield on a 
stock which is necessary to induce someone to purchase it. Hence, the higher 
the proportion of the external funds financing an investment project which 
is obtained by the sale of bonds, the greater the share of the returns from 
that investment available to current shareholders. If the return on an
investment is 9 per cent, the interest on a bond is 5 per cent, and that on
money obtained otherwise is 7 per cent, then the net yield to current stock- 
holders of the investment will be 4 per cent if it is financed entirely by 
bonds. But if it is financed half by bonds and half otherwise, so that the 
average payment for money is 6 per cent, the net return on the investment 
will be only 3 per cent. 
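
The arithmetic of this example can be checked with a short Python sketch. The function name and its parameters are illustrative inventions; the rates are those used in the text.

```python
def net_yield(project_return, bond_rate, other_rate, bond_share):
    """Return left for current stockholders after paying for the funds.

    bond_share is the fraction of the external funds raised by bonds;
    the remainder is assumed to be raised at other_rate.
    """
    if not 0.0 <= bond_share <= 1.0:
        raise ValueError("bond_share must lie between 0 and 1")
    average_cost = bond_share * bond_rate + (1.0 - bond_share) * other_rate
    return project_return - average_cost


if __name__ == "__main__":
    for share in (1.0, 0.5, 0.0):
        y = net_yield(0.09, 0.05, 0.07, share)
        print(f"bond share {share:.0%}: net yield {y:.1%}")
```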

However, there is a catch to this. In effect, bond holders absorb less 
than their share of the company risk by passing it along to the stockholder. 
The more bonds there are outstanding, the greater the danger to
shareholders and the company. If it has sold only so many bonds that its annual
interest commitment is $100,000, then net annual earnings (before interest 
payment) of $250,000 will suffice to keep the company out of trouble and 
will leave $150,000 for plowback or dividends. However, if so many bonds 
are outstanding that the contracted annual interest payment amounts to 
$400,000, a $250,000 level of earnings can lead to very serious results, and
if continued long enough will result in insolvency. In other words, the 
leverage provided by the sale of additional bonds represents both an in- 
crease in the expected earnings of the stockholder and an increase in his 
risk.²⁰ It also means that the price of the company’s stocks is likely to
fluctuate more widely. Since his share of company earnings is a residue 
after the fixed-interest obligation, he does not have to share the company’s 
unexpected prosperity with the bond holder. But, on the other hand, he 
gets no help from the bond holder in absorbing company losses. The stocks 
of the highly levered company, i.e., of the company with a high proportion 
of bonded indebtedness, are therefore likely to rise sharply when the market 
expects the firm to prosper and to fall drastically when adversity is 
foreseen. 

From this it follows that a conservative managerial group (and a con- 
servative stockholder) will typically dislike large amounts of bond financing 
because it magnifies the speculativeness of the company and its shares. 
We will discuss presently whether this reaction is entirely justified. 


20 To illustrate how increased bondholding is likely to increase the variance and 
likely range of stockholder earnings consider Table 4. 


TABLE 4
1. Total earnings                           100   200   300   400   500   600
2. Earnings per share at 100 shares           1     2     3     4     5     6
3. Earnings after interest ($200)          -100     0   100   200   300   400
4. Net earnings per share at 50 shares       -2     0     2     4     6     8


Here we have a company which is considering issuing either 100 (thousand) shares of 
stock, or 50 shares and 50 bonds. If it does the latter, it will incur a fixed interest obli- 
gation of 200 (thousand). The first row of the table simply lists some alternative possible 
earnings levels, ranging from 100 to 600. Row 2 indicates the corresponding earnings of 
any one of the 100 shares if no bonds were issued. Row 3 subtracts the fixed $200 interest 
debt from each possible earnings figure, and, finally, row 4 (= 1/50 times row 3) lists 
the earning to each of the 50 shares which would be outstanding under the stock-bond 
financing arrangement. Note that the lowest level of earnings per share has dropped 
from +1 in the pure stock case to —2 in the bond-stock case, while the highest earnings 
level has risen from 6 to 8! There are two reasons for this phenomenon. The fixed interest 
obligation reduces net stockholder earnings and produces losses at lower earnings levels. 
On the other hand, the reduced number of shares increases earnings per share once total 
earnings go beyond some minimal level. 

Note also how leverage magnifies both gains and losses in earnings. In our example, 
when total earnings (line 1 in the table) rise by 100 per cent from 300 to 600, the levered 
earnings per share (line 4) rise 300 per cent, from 2 to 8. But if total earnings decrease 
25 per cent from 400 to 300, earnings per share fall by 50 per cent, from 4 to 2. 
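
The rows of Table 4 and the percentage changes just cited can be reproduced with a brief Python sketch. The function names are illustrative; the share counts, interest charge, and earnings levels are those of the table.

```python
def earnings_per_share(total_earnings, shares, interest=0.0):
    """Earnings per share after any fixed interest obligation is paid."""
    return (total_earnings - interest) / shares


def percentage_change(old, new):
    """Change from old to new, expressed as a percentage of old."""
    return 100.0 * (new - old) / old


TOTAL_EARNINGS = [100, 200, 300, 400, 500, 600]

# Row 2: pure stock financing, 100 shares and no interest.
unlevered = [earnings_per_share(e, shares=100) for e in TOTAL_EARNINGS]
# Row 4: stock-bond financing, 50 shares and a fixed interest charge of 200.
levered = [earnings_per_share(e, shares=50, interest=200) for e in TOTAL_EARNINGS]

if __name__ == "__main__":
    print("total earnings:", TOTAL_EARNINGS)
    print("unlevered EPS: ", unlevered)
    print("levered EPS:   ", levered)
    # Earnings rise from 300 to 600: compare the two percentage changes.
    print(percentage_change(300, 600), percentage_change(levered[2], levered[5]))
    # Earnings fall from 400 to 300.
    print(percentage_change(400, 300), percentage_change(levered[3], levered[2]))
```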
d. Convertible securities, direct loans, etc. There are other instruments
for financing which need not be discussed in detail. Firms may borrow 
directly from banks or insurance companies or Federal agencies. They may 
also issue hybrid securities, such as convertible bonds. A convertible bond 
is a bond which the holder can, subject to certain restrictions, trade in for 
company shares at a prespecified rate of exchange. A convertible is often 
safer than either a stock or a bond because the holder can, when he sells it, 
dispose of it either as a bond or as a corresponding number of stocks, 
whichever happens to have the higher market value at that time. Con- 
vertibles are issued by companies that wish to provide a somewhat safer 
type of security or that believe that the true value of their stock is not yet
realized by the market. When the price of the company’s stock rises suffi- 
ciently, the holders of convertibles will all find it profitable to transform 
them into stocks, and this will then automatically eliminate any bonded 
company indebtedness which the securities originally represented. Con- 
vertibles are also particularly salable because they are highly valued by 
some institutional purchasers whose rules of operation prohibit them from 
buying common stocks directly and who can therefore acquire them only 
through the purchase and conversion of this type of issue. 


10. On Optimal Financial Policy 


It would appear, then, that in financing its investment, management has 
a very considerable range of real options and that careful calculation is re- 
quired for an optimal decision between the issue of bonds or common stock 
or between dividend payment and plowback (earnings retention). However, 
in a series of recent articles²¹ Professors Modigliani and Miller have shown
that these alternatives are not as different as they seem and that, in the
absence of special tax problems, transactions costs, and market imperfections,
there is little, if anything, to choose.

a. Effects of leverage: stocks vs. bonds. First of all let us examine the 
effects of added leverage on the shareholder, i.e., let us see how his interests 
are affected by the choice between the emission of more stocks and addi- 
tional bonds. Modigliani and Miller show that in a market with no imper- 
fections, taxes, or transactions costs the shareholder can arrange for 
himself any degree of leverage he desires. That is, if the company's leverage 


21 See Franco Modigliani and M. H. Miller, “The Cost of Capital, Corporation 
Finance and the Theory of Investment," American Economic Review, Vol. XLVIII,
June 1958, reproduced in Solomon, op. cit., and M. H. Miller and Franco Modigliani, 
"Dividend Policy, Growth and the Valuation of Shares," Journal of Business, Vol. 
XXXIV, October 1961. 
is too high for his taste, he can take steps to reduce the leverage which pertains
to his own shares, or, if he wishes, he can increase the leverage associated
with his holdings. What management does in this respect is thus of
little or no interest to him, and the company’s leverage decision consequently
has no effect on the marketability of its securities.

To understand the argument we must see how one can decrease leverage. 
Though the answer is not complex, it is rather tricky and its comprehension 
may require the reader’s concentration. One can reduce leverage to any 
desired degree simply by purchasing bonds and selling a number of stocks 
of equal value, thus reducing the risk of one’s holdings. One trades a 
problematical income for a contractually fixed income, and the higher the 
proportion of the latter, the lower the leverage represented by one’s hold- 
ings. Suppose the company has three times as many stocks as bonds 
outstanding. The individual investor who holds stocks and bonds in a three 
to one ratio has completely “unlevered” his holdings, i.e., his risks and
returns are exactly the same as if the company had issued no bonds and he 
had purchased only stocks with the funds he had invested. For if any in- 
vestor owns K per cent of the company's financial instruments, he will be 
entitled to exactly K per cent of its earnings, no matter what form those 
instruments and earnings may take. 
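
The point can be illustrated with a small Python sketch. The figures are hypothetical: an annual interest charge of 200 and an investor who owns 1 per cent of the stock and, in the second case, 1 per cent of the bonds as well. The function name is an invention for the example.

```python
def investor_income(earnings, interest_due, stock_fraction, bond_fraction):
    """Income of an investor owning the given fractions of all the company's
    stocks and of all its bonds (hypothetical illustration)."""
    to_stockholders = earnings - interest_due  # stockholders get the residue
    return stock_fraction * to_stockholders + bond_fraction * interest_due


if __name__ == "__main__":
    K = 0.01          # the investor owns 1 per cent of each instrument
    INTEREST = 200.0  # hypothetical fixed interest obligation
    for earnings in (100.0, 300.0, 600.0):
        levered = investor_income(earnings, INTEREST, K, 0.0)   # stocks only
        unlevered = investor_income(earnings, INTEREST, K, K)   # stocks and bonds
        print(f"E = {earnings:5.0f}: stocks only {levered:6.2f}, "
              f"stocks and bonds {unlevered:6.2f}, K * E = {K * earnings:6.2f}")
```

In the second column the interest the investor receives on his bonds just offsets the interest deducted from his share of earnings, so his income is K per cent of total earnings whatever the company's leverage.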

Thus, we see that in the circumstances envisaged, any shareholder can 
arrange for “homemade leverage” sufficient to undo a managerial decision
on leverage. By buying bonds he can reduce leverage and by the sale of 
bonds he can increase it. If he holds no bonds, so that he has none to sell, 
he can borrow money and achieve the same effect, for borrowing amounts 
to the same thing as selling bonds, since a bond is a loan, i.e., a “negative 
debt.” Hence the welfare of the stockholder, in pursuit of whose interest
the corporation is presumably run, is in this perfect market world totally 
unaffected by the choice between the issue of new stocks and new bonds. 

Let us now introduce a “degree of risk" index which we can measure for 
any particular firm. It follows from the preceding discussion that in these 
ideal circumstances all firms with equal risk must yield the same earnings 
return on their total capital (stocks plus bonds), no matter what their 
leverage. For if one company's securities were underpriced in the sense that 
they earn a higher return per dollar of investment than the securities of 
another company, it would pay each investor to switch his funds to the
high-return company and then use homemade leverage to arrange for
the degree of leverage he prefers. This would raise the price of the under- 
valued securities and lower the price of the other company's securities 
until they offered the same rate of return. Thus, in algebraic terms, if S is 
the total value of a company's stocks, B is the value of its bonds, E is com- 
pany earnings (before deduction of interest), and r is the rate of return, we 
must have


(8)    E/(S + B) = r (constant)

for all companies which are equally risky in some measurable sense. 

We may deduce that the earnings per dollar of market value of shares 
increases linearly with the degree of leverage, i.e., with the proportion of 
the company’s bond capital.²² In other words, a graph with earnings-per-unit
share value on the vertical axis and number of bonds on the horizontal
axis will always be a straight line in these circumstances. That is, the higher
the proportion of its new investment which the company finances by means of
bonds, the higher the return to each shareholder. Moreover, since the rela- 
tionship is linear, it means that the marginal earnings yield per stockholder 
of an additional bond will be absolutely constant; there will be no dimin- 
ishing returns to stockholders resulting from increments to the company's 
issuance of bonds. 

Of course, as we have already noted, this is by no means pure gravy. As
we have seen, the higher the amount of bonds, the greater the stockholder's 
leverage risk. This is why stockholders do not seek unlimited increases in 
leverage. As a matter of fact, that is precisely the explanation of our linear 
relationship. It tells us, in effect, that in equilibrium, a linear increase in 
earnings per share will just suffice to compensate the stockholder for the risks 
which accompany enhanced leverage. This price-earnings pattern is the means, 
according to Modigliani and Miller, by which market forces prevent the 
stockholder from either gaining or losing from an increase in leverage. 
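
A numerical check of the linear relationship may be helpful. The sketch below assumes hypothetical values (r = 8 per cent, a bond rate i = 5 per cent, and a total market value of 1,000) and compares the earnings per dollar of share value computed directly from Equation (8) with the expression r + (r - i)B/S, which follows from it when E = r(S + B). The function names are illustrative.

```python
def yield_on_shares(total_value, bond_value, r, i):
    """Earnings per dollar of share value, (E - iB)/S, where E = r(S + B)."""
    stock_value = total_value - bond_value
    earnings = r * total_value
    return (earnings - i * bond_value) / stock_value


def yield_from_formula(leverage_ratio, r, i):
    """The same quantity written as r + (r - i) * B/S."""
    return r + (r - i) * leverage_ratio


if __name__ == "__main__":
    R, I, TOTAL = 0.08, 0.05, 1000.0  # hypothetical figures
    for bonds in (0.0, 200.0, 400.0, 500.0):
        ratio = bonds / (TOTAL - bonds)
        direct = yield_on_shares(TOTAL, bonds, R, I)
        linear = yield_from_formula(ratio, R, I)
        print(f"B/S = {ratio:.3f}: direct {direct:.4f}, formula {linear:.4f}")
```

Each unit increase in B/S adds the same amount, r - i, to the yield on shares, which is the constant marginal yield referred to above.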

22 For if the interest rate per bond is i, so that the total earnings of the bond holders
is iB, the amount left for stockholders will be E - iB, and the earnings per dollar of
share value will be (E - iB)/S. Then, since by (8) E = rS + rB, the earnings per
dollar of stock will be


(E - iB)/S = (rS + rB - iB)/S = r + (r - i)B/S,


which, so long as r > i, clearly increases linearly with the ratio B/S.


b. Dividend payment vs. income retention. Moreover, Modigliani and
Miller have shown that in such circumstances, the shareholders’ interests
are unaffected by the decision between plowback and the issue of new
shares in the current period as the means to finance a given investment
project. This argument also is fairly subtle. We assume that none of the
company’s future plans and, in particular, investment decisions will in any
way be affected by the decision in question. Now the value, V, of all company
shares outstanding at the end of the period is determined by the expected
value of future capital gains and future dividends which are by
assumption given. V may therefore be treated as a constant, unaffected by 
the current decision between use of plowback and new stock financing to 
pay for the fixed level of investment. Suppose, therefore, that instead of 
financing all of the project out of current earnings, management pays the 
current shareholders D dollars in dividends which would otherwise have 
gone into the investment. To make up the deficit, the company must sell 
D dollars worth of additional securities to new stockholders. Let us see how 
the initial stockholders come out of all this. The value of all the company’s 
shares at the end of the period is V dollars, as we have seen. But D dollars 
worth of these shares now belong to others. Hence our original stockholders’ 
shares are now reduced in value to V - D dollars. However, in addition
they have received D dollars in cash. Therefore the net value of their assets
is (V - D) + D = V! That is, the entire procedure has produced absolutely
no change in the well-being of the original stockholders!
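
The argument can be traced through share by share in a short Python sketch. The figures (an end-of-period value V of 10,000 and 1,000 original shares) are hypothetical, and the function is an illustration of the reasoning above rather than a valuation model: it takes V as given, as the text does, and ignores taxes and transactions costs.

```python
def original_holders_wealth(end_value, dividend, original_shares=1_000.0):
    """Wealth of the original stockholders when `dividend` dollars are paid
    out and replaced by selling new shares. end_value is V, taken as fixed."""
    if not 0.0 <= dividend < end_value:
        raise ValueError("dividend must be non-negative and less than V")
    # New shares must raise exactly `dividend` dollars at the end-of-period
    # price, so new_shares * price == dividend.
    new_shares = dividend * original_shares / (end_value - dividend)
    price = end_value / (original_shares + new_shares)
    holdings = original_shares * price   # equals V - D
    return holdings + dividend           # (V - D) + D


if __name__ == "__main__":
    V = 10_000.0  # hypothetical value of all shares at the end of the period
    for d in (0.0, 500.0, 2_000.0, 5_000.0):
        print(f"dividend {d:7,.0f}: wealth {original_holders_wealth(V, d):10,.2f}")
```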

It would appear, then, that management has absolutely no grounds for 
choosing among methods of financing. In this crazy world, flipping a coin 
or consultation with an astrologer will produce an optimal decision since 
plowback or added stocks or added bonds are all equally happy decisions 
from the viewpoint of the current stockholder.²³ None of this seems to
accord with common sense or the judgment of those experienced in these 
matters. We shall now see what has apparently gone wrong with the argu- 
ment and why, despite its apparent conflict with the realities, the 
Modigliani-Miller analysis is so instructive. 

c. Effects of transactions costs, taxes, and market imperfections. It has 
been emphasized from the beginning of this discussion that the absence of
transactions costs, taxes, and market imperfections is assumed. Now we 
are entirely unused to a world without these characteristics and are sur- 
prised by its properties, as we would be by the physicist’s theoretical con- 
struct of a world without friction, when a ball, once thrown into space, 
might continue in flight forever. Why do some stockholders prefer earnings 
to be retained? Because they thereby escape three costs: (1) the heavier 
income tax on dividends which they would have to pay instead of the lower 
tax on capital gains, (2) the transactions cost they would incur if they 
decided to reinvest their accumulated dividends through the purchase of 
more stocks—if dividends are retained they are automatically reinvested 
by the company, with no brokerage charges falling on the stockholder, and 
(3) the company avoids the transactions outlay required to raise funds to 


23 From the foregoing it follows that the cost of capital is, from the point of view of 
the stockholder, independent of the means whereby the funds are obtained. This con- 
clusion, of course, is strictly true only in our highly simplified, frictionless world. 
replace the dividends as a means for financing its investment program.
These transactions costs, including the costs of compliance with govern- 
ment regulations, can be very substantial and represent a real drain on 
company earnings. 

We can also understand in these terms why some other stockholders 
prefer the company to pay out dividends rather than retain earnings. Some 
investors who are uncertain about a company’s future may wish to limit 
the amount of money they invest in it—that is, they want to keep their 
investment at a fixed level rather than have it grow automatically over a 
period of time. In that case, it may be less expensive for a small investor 
whose tax rate is relatively low to receive a substantial dividend than to 
undertake the transactions costs incurred by selling stocks as earnings are 
plowed back into the company. 

In addition, some small investors prefer high dividends because they 
live on these regular payments. If there were no transactions costs, such a 
person could, as the value of his shares rose because of plowback, sell a 
corresponding small number of shares or fractions of shares. He would 
thereby receive his regular income without any consequent decline in the 
capital value of his holdings. But such a procedure brings with it brutal 
transactions costs. The smaller the number of shares involved in a given 
sale, the larger the brokerage cost per share, and when the number of stocks 
involved is very small, this charge becomes prohibitive. The regular 
monthly or quarterly sale of a few shares as a means for keeping up a
steady income flow is just totally impractical. 

We conclude that decisions on payout versus borrowing versus the issue 
of new stocks can matter to the company and to stockholders. They matter 
in the short run because a temporary high evaluation (overevaluation) of 
the company by the stock or bond markets may render outside capital 
particularly inexpensive as a means to finance investments by sale of 
stocks and (to some lesser degree) bonds. Moreover, dividend policy may 
have at least temporary effects on the value of company stocks. If, as is 
frequently alleged to be the case, many investors believe that the market 
considers high-dividend shares to be a better buy than other stocks which 
are comparable in all other respects, then the price of high-dividend stocks 
may be driven up accordingly. For even if the purchaser of such a share 
does not think it is worth more than a share in a high-plowback company, 
he will have to take into account the possibility that potential purchasers 
will take this view when he decides to sell the stock in the future. 

Decisions on financing also matter in the long run because of transac- 
tions costs, taxes, and uncertainties, as we have already seen. As a result of 
all this, the decision on the form of financing remains an important one for 
management. 
We come, finally, to an important but ambiguous concept, the so-called
cost of capital. This is the rate of yield against which prospective invest- 
ment projects should be compared. That is, any investment which yields a 
rate of return greater than the “cost of capital” will be beneficial to stockholders,
while any project whose return is less than the cost of capital will
reduce the return to stockholders. 

Strictly speaking, the cost of capital must be defined as the opportunity 
cost of money. It is the rate of return on the best alternative investment 
opportunity. So long as a project earns no less than would be returned by 
the best alternative use of the funds, stockholders come out ahead. 

Where then are the complications? Let us consider them one at a time. 
We begin by assuming that there are no risks in the firm’s operations. 


1. Interest rates and cost of capital in a perfect capital market. Suppose 
that the firm can borrow or lend all it wants at a fixed interest rate; that 
rate is then the cost of capital. For any investment project which yields a 
return higher than the interest rate will already have been undertaken, 
financed by borrowing if necessary. Hence the highest earnings that can be 
obtained from additional funds are those which would be acquired by 
lending them out at the going interest rate. 

2. Variable cost of funds. Sometimes the cost of obtaining money in- 
creases with the amount that is acquired. In that case it is the marginal 
cost of borrowing which constitutes the cost of capital, for to maximize
return, the marginal yield of the investment must be equated to the 
marginal cost of borrowing. 

3. Pure capital rationing. If the firm can neither obtain money from 
the outside nor dispose of funds elsewhere, then the cost of capital becomes 
the marginal return of money in the most profitable internal use available. 
If the investment decision problem is formulated with the aid of linear 
programming, as was done earlier in the chapter, then duality theory can 
help us, in principle, to determine the cost of capital. If M_t is the amount of
money in period t and D_t is the optimal value of the corresponding dual
variable, then duality theory tells us that D_t is the (highest attainable)
marginal yield of money in period t, and so it must represent the cost of
capital in that period.²⁴
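
The interpretation of the dual variable as a marginal yield can be illustrated in a deliberately simplified case. The sketch below assumes a single period, a fixed budget, and projects that can be adopted fractionally, none of which is specified in the text; the project figures are hypothetical. In that special case the linear program is solved by funding projects in descending order of present value per dollar, and the dual value of the budget constraint equals the present value per dollar of the marginal (partly funded) project. The code checks this by raising the budget slightly.

```python
def best_present_value(budget, projects):
    """Maximum total present value from divisible projects under one budget.

    projects is a list of (cost, present_value) pairs. With a single
    constraint the LP optimum is found by taking projects in descending
    order of present value per dollar of cost.
    """
    total = 0.0
    remaining = budget
    ranked = sorted(projects, key=lambda p: p[1] / p[0], reverse=True)
    for cost, value in ranked:
        take = min(1.0, remaining / cost)  # fraction of the project adopted
        if take <= 0.0:
            break
        total += take * value
        remaining -= take * cost
    return total


def marginal_yield(budget, projects, step=1e-3):
    """Finite-difference estimate of the dual value of the budget constraint."""
    gain = (best_present_value(budget + step, projects)
            - best_present_value(budget, projects))
    return gain / step


if __name__ == "__main__":
    # Hypothetical (cost, present value) pairs.
    PROJECTS = [(100.0, 150.0), (200.0, 260.0), (150.0, 180.0), (300.0, 330.0)]
    print("optimal present value:", best_present_value(350.0, PROJECTS))
    print("marginal yield of money:", marginal_yield(350.0, PROJECTS))
```

In the multi-period problem discussed in the text the dual values would come from solving the full linear program, and the simple ranking used here would no longer suffice.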


24 Unfortunately, the calculation of the dual values in this problem is somewhat 
complicated. These values depend on the discounted present values of the various 
alternative projects (the coefficients of the objective function), but, simultaneously, the 
discount rates depend on the dual values. 
4. Cost of capital in a Modigliani-Miller model. We introduced a risk
index earlier into a Modigliani-Miller model and concluded that all equally 
risky firms in the construct must produce the same rate of earnings per 
dollar of money investment in accord with Equation (8). It is easy to show 
that in this model the cost of capital is r, the rate of return on investment for
firms of the given degree of risk. This is so even if investment is financed by
the sale of bonds (borrowing) at some interest rate i which is lower than r.²⁵
This result gives rise to the following paradox. Suppose our firm normally 
earns 8 per cent and is given a chance to borrow funds at 3 per cent which 
it can invest in the company at a return of 7 per cent. Management would 
then be making a serious mistake to undertake this apparently profitable 
transaction! For though it would increase the stockholders’ earnings, it 
would not do so by an amount sufficient to offset the risk cost of the in- 
creased leverage produced by the new bonds, and hence stockholders would 
lose out on balance. This result follows rigorously and inescapably from the 
Modigliani-Miller premises. It should be said, in conclusion, that Modigliani 
and Miller feel their model and all its conclusions are widely applicable in 
the real world as reasonable approximations to the facts, though this 
evaluation is not universally accepted. 
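
The paradox can be worked through numerically. The sketch below applies Equation (8) before and after a bond-financed investment, using the rates of the example (r = 8 per cent, k = 7 per cent) together with hypothetical initial earnings of 80, outstanding bonds of 400, and an investment of 100. The function name is an invention for the illustration. Note that the 3 per cent bond rate never enters the calculation.

```python
def stock_value_before_and_after(e0, b0, investment, k, r):
    """Value of the common stock before and after a bond-financed project,
    with the value of the firm given by V = E / r as in Equation (8)."""
    v0 = e0 / r                          # initial value of the firm
    s0 = v0 - b0                         # initial value of the stock
    v1 = (e0 + k * investment) / r       # value after the project is added
    s1 = v1 - (b0 + investment)          # new bonds pay for the whole project
    return s0, s1


if __name__ == "__main__":
    s0, s1 = stock_value_before_and_after(
        e0=80.0, b0=400.0, investment=100.0, k=0.07, r=0.08)
    print(f"stock value before: {s0:.2f}")
    print(f"stock value after:  {s1:.2f}")
    print(f"change:             {s1 - s0:.2f}")
```

The change in the value of the stock equals kI/r - I, which is negative whenever k is less than r, however cheaply the funds are borrowed.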

In this section, then, we have examined the important concept of the 
cost of capital. It is at the heart of the investment decision problem, for 
without this measure we can neither discount correctly nor decide on the 
proper amount of investment and the optimal selection of investment
projects. We have seen that only in the simplest of circumstances is this
measure given by the rate of interest, as was assumed in much of the discus- 
sion earlier in this chapter. Where (as it is in reality) the market for funds 
is more complex than the perfect capital market of pure competition theory 


25 For consider an investment costing I dollars which yields a rate of return, k, per
dollar of investment. By (8), then, the initial value of the firm, V_0 = S_0 + B_0, is given by


V_0 = E_0/r.

But after the investment the new value, V_1, becomes

V_1 = E_1/r = (E_0 + kI)/r = V_0 + kI/r,

so that the company’s common stock will now be worth

S_1 = V_1 - B_1 = V_1 - (B_0 + I) = V_0 + kI/r - B_0 - I.

But since S_0 = V_0 - B_0, this becomes

S_1 = S_0 + kI/r - I.


Hence the stockholders lose money if and only if k < r, and they make money if and
only if k > r, i.e., if and only if the rate of return on the new investment, k, exceeds r,
the rate of return for firms of this risk class. This is so no matter what the bond rate of
interest, i!
and is affected by taxes and transactions costs, more subtle measures of the
cost of capital are required. We have discussed the appropriate concept in 
a number of interesting cases which typify a wide variety of circumstances, 
but it should be obvious that there remain even messier conditions in which 
still other cost-of-capital constructs are likely to be appropriate. 


12. Concluding Comment


This chapter has described some of the analytic tools which have been 
used to examine the company’s investment decision and the means for its 
financing. It appears to follow from the discussion of our last sections that 
the decision on the nature of the physical investment project and its timing 
is likely to be more important for the company’s welfare than is the selec- 
tion of the pattern of financing, though, in practice, the latter may well 
influence the former. In any event, the development of powerful tools 
appropriate for the analysis of investment projects has gone much further 
than the design of methods for dealing with financing problems. We have 
seen the many types of problem which arise in the determination of invest- 
ment plans and have gone over in considerable detail methods which can 
be used for their investigation. In the case of financial decisions, we have 
only been able to specify some of the alternatives and to provide some 
perspective on the significance of the choice among them. 


APPENDIX: CONTINUOUS COMPOUNDING AND DISCOUNTING 


The differential calculus is a very useful technique for the capital 
theorist. However, the discounting formulae of Section 4 do not lend them- 
selves readily to differentiation. For example, to find the optimal duration 
of an investment, t, we might want to differentiate with respect to t an 
expression such as 


$C = R_0 + DR_1 + D^2R_2 + \cdots + D^tR_t.$ 


There are two difficulties. First, the derivative $dD^t/dt$ of a term such as 
$D^t$ is given by a fairly messy and inconvenient formula. Second, and more 
serious, is the fact that in annual compounding t cannot be changed con- 
tinuously. Rather, it must be varied in one-year jumps. And in going from 
t to t + 1 (the next higher value of t), our sum changes from 


$R_0 + DR_1 + D^2R_2 + \cdots + D^tR_t$ 

to 

$R_0 + DR_1 + D^2R_2 + \cdots + D^tR_t + D^{t+1}R_{t+1}.$ 

That is, the term $D^{t+1}R_{t+1}$ is suddenly added to the series. Such abrupt 
jumps make differentiation impossible. 

To overcome these difficulties t must be permitted to vary con- 
tinuously. This can be done in the following manner. If we were to reduce 
our period for compounding from a year to half a year, t could be varied 
from t to t + 1/2 instead of having to go all the way to t + 1. Then our 
expression for gross earnings after t years, instead of being given by 
$P(1 + i)^t$, becomes 


$P(1 + i/2)^{2t}.$ 


Similarly, with quarterly compounding, t can be changed by still smaller 
amounts, and the compound interest expression becomes 


$P(1 + i/4)^{4t}$, etc. 


The direction in which we are heading should now be obvious. The 
object is to conceive of an unceasing compounding process which goes on 
at every moment so that, instead of moving in jumps, t can be varied 
continuously. We define the number²⁶ e as the limit of the expression 
$(1 + 1/n)^n$ as n approaches infinity; i.e., it is the yield on a dollar (P = 1) 
invested for one year (t = 1) at a 100 per cent rate of interest, if interest 
is compounded continuously. Thus, P dollars invested for t years at this 
rate of interest will yield with continuous compounding an amount 


(1)  $A = Pe^t.$ 


Finally, if the continuous interest rate is to be r rather than 100 per cent, 
the expression becomes²⁷ 

$A = Pe^{rt}.$ 
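
The convergence described above can be checked numerically. The following is only a sketch under stated assumptions: the function names, the symbol `m` for the number of compounding periods per year, and the sample figures (a principal of 100, a rate of 8 per cent, five years) are illustrative and do not come from the text.

```python
import math


def compound(P, rate, t, m):
    """Value of P after t years at annual rate `rate`, compounded m times a year."""
    return P * (1 + rate / m) ** (m * t)


def compound_continuous(P, rate, t):
    """Value of P after t years under continuous compounding: P * e^(rate * t)."""
    return P * math.exp(rate * t)


if __name__ == "__main__":
    P, rate, t = 100.0, 0.08, 5  # illustrative figures only
    # More frequent compounding moves the result toward the continuous limit.
    for m in (1, 2, 4, 12, 365):
        print(m, round(compound(P, rate, t, m), 4))
    print("continuous", round(compound_continuous(P, rate, t), 4))
```

As the number of compounding periods per year grows, the discrete figure approaches the continuous one from below, which is the sense in which e is the limit of $(1 + 1/n)^n$.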


It is now a simple matter to define discounted present value in terms of 
continuous compounding at an r per cent interest rate, for suppose $P_t$ 
dollars are to be received t years in the future. In t years, $P_0$ dollars will 
grow into $P_0e^{rt}$. Then if $P_0$ is the correct present value of $P_t$, we must have 
$P_t = P_0e^{rt}$, i.e., 


(2)  $P_0 = (1/e^{rt})P_t = e^{-rt}P_t.$ 


26 It can be proved that e is approximately equal to 2.718. 
27 For the (instantaneous) interest rate is given by (1) as 


$(dA/dt)/A = rPe^{rt}/Pe^{rt} = r.$ 
This basic expression has the advantage of an extremely simple differentia- 
tion formula: 


$dP_0/dt = -re^{-rt}P_t.$ 
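
A brief numerical check of Equation (2) and of its derivative may be helpful. This is a sketch, not part of the original exposition: the function names and the sample values (1,000 dollars due in ten years, a 6 per cent continuous rate) are assumptions chosen for illustration.

```python
import math


def present_value(P_t, r, t):
    """Equation (2): present value of P_t dollars due in t years, rate r."""
    return P_t * math.exp(-r * t)


def d_present_value_dt(P_t, r, t):
    """Analytic derivative of the present value with respect to t."""
    return -r * math.exp(-r * t) * P_t


def numeric_derivative(P_t, r, t, h=1e-5):
    """Central finite-difference approximation, used only as a cross-check."""
    return (present_value(P_t, r, t + h) - present_value(P_t, r, t - h)) / (2 * h)


if __name__ == "__main__":
    P_t, r, t = 1000.0, 0.06, 10.0  # illustrative figures only
    print(present_value(P_t, r, t))
    print(d_present_value_dt(P_t, r, t), numeric_derivative(P_t, r, t))
```

The two derivative figures should agree closely, which confirms that postponing a fixed receipt lowers its present value at the proportional rate r.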


Illustration: The standard elementary examples of the point-input, 
point-output cases are the growing of wine and lumber. Letting the tree 
grow older means that we obtain more lumber from it, and, up to a point, 
as wines age they grow more valuable. What is the optimal age at which 
to consume such an item? 

We have the marginal condition (attributed to W. S. Jevons) which 
states that, optimally, wine (or lumber) should be permitted to age until 
the point where diminishing returns reduce the percentage marginal yield 
of aging down to the level of the (per cent) rate of interest. To show this, 
let the value, V, of the total product be a function of the amount invested, 
I, and the length of time, t, for which the investment runs. This function 
may be written V = f(I, t). 

The anticipated profit, $\Pi$, of the businessman at the date he makes the 
investment is the present value of V, less the cost of his investment, I. 
Since the value, V, which he receives for his product only accrues to him 
at a point t periods in the future, it must be discounted at the appropriate 
interest rate, r, to obtain its present value, which, by the usual formula 
[Equation (2), above], is given by $Ve^{-rt}$. 

Thus, the anticipated profit from the transaction is 


$\Pi = Ve^{-rt} - I = f(I, t)e^{-rt} - I.$ 


Given the amount of the initial investment, I, we maximize $\Pi$ by 
setting the partial derivative with respect to t equal to zero to obtain 


$\partial\Pi/\partial t = (\partial f/\partial t)e^{-rt} - rfe^{-rt} = 0,$ 


or, dividing through by $e^{-rt}$ and rearranging terms, 


$(\partial f/\partial t)/f = r,$ 


which is the result we are seeking. It states that the relative yield of an 
increase in t, $(\partial f/\partial t)/f$, must in equilibrium be equal to the continuously 
compounded interest rate. 
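
The marginal condition can be illustrated with a concrete value function. The sketch below assumes, purely for illustration, that $f(I, t) = Ie^{\sqrt{t}}$; this functional form, the function names, and the 10 per cent rate are hypothetical and are not taken from the text. For this form the relative yield of aging is $1/(2\sqrt{t})$, so the condition gives an optimal age of $1/(4r^2)$.

```python
import math


def value(I, t):
    """Hypothetical value function f(I, t) = I * exp(sqrt(t)) (an assumption)."""
    return I * math.exp(math.sqrt(t))


def profit(I, t, r):
    """Anticipated profit: discounted value of the product less the outlay I."""
    return value(I, t) * math.exp(-r * t) - I


def relative_yield(t):
    """(df/dt)/f for the assumed value function, i.e. 1 / (2 * sqrt(t))."""
    return 1.0 / (2.0 * math.sqrt(t))


def optimal_age(r):
    """Age at which the relative yield has fallen to the interest rate r."""
    return 1.0 / (4.0 * r * r)


def best_age_by_search(I, r, step=0.01, horizon=100.0):
    """Brute-force cross-check: the age on a grid that maximizes profit."""
    ages = [k * step for k in range(1, int(round(horizon / step)) + 1)]
    return max(ages, key=lambda t: profit(I, t, r))


if __name__ == "__main__":
    r = 0.10
    t_star = optimal_age(r)
    print(t_star, relative_yield(t_star), best_age_by_search(1.0, r))
```

The grid search is included only to confirm that the age satisfying the marginal condition is in fact the profit-maximizing one for this example.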
Besides seeking to explain investment decisions, capital theory 
also includes a second central topic: the determination of the rate of interest 
—the return to saving and investment. This, in turn, is taken to explain 
two phenomena of critical interest to society. First, it determines the share 
of national income going to capitalists rather than to workers. Second, it 
apportions output between present and future generations, with the level 
of savings (foregone consumption) today constituting the quantity of 
resources available for investments that permit larger outputs tomorrow. 

This chapter will first examine more carefully the meaning of “Capital” 
and the measurement of its quantity. Next, it will show how the marginal 
productivity theory determines the rate of interest in its general equilibrium 
model and how this, in turn, regulates saving and investment behavior. Then 
we will examine some standard neoclassical models to show how the interest 
rate can affect the capital-output ratio, the productivity of labor, and other 
characteristics of the productive technology. Finally, we will review the 
grounds on which the Anglo-Italian school associated with Cambridge 
University has attacked the generality of these conclusions. Until recently, 
it was thought that low interest rates necessarily provide a powerful 
stimulus for economic abundance, always leading to the adoption of tech- 
niques using large quantities of capital relative to labor, and hence always 
increasing the productivity and incomes of the workers. This view of the 
matter has become a central issue in the discussion between the two Cam- 
bridges. While examining the substance of the debate we will also see how 
capital theory conducts its analysis. 
The discussion will ignore risk and uncertainty, for we will have enough 
material to cover without dealing with the resulting complications. 


1. Time as a Requisite of Production Processes 


Early in the history of the labor theory of value it became clear that 
commodity prices cannot be explained simply by the amounts of physical 
labor needed to produce them. If one item, A, takes ten hours of labor to 
produce and is instantly ready for use, while another, B, also employs ten 
hours of labor but then requires some time before it is usable (e.g., the 
period required for lumber to dry after the trees are cut down), then item 
B will usually sell at a higher price than A’s. For a businessman who is to 
be induced to invest in the manufacture of product B knows that he must 
not only lay out the wages needed to produce it—he must also be prepared 
to see his cash tied up for some considerable period. 

From this observation two things are clear: first, that the mere passage 
of time can be a crucial requisite of production—indeed, that time can, in 
a sense, be considered an input very much on a par with labor and raw 
materials; second, like any other input, time has its price which is usually 
measured by the rate of interest. 

Physical capital is defined to consist of all inputs used in the production 
process which are themselves products of the economy. Thus, it includes 
plant, machinery, inventories of raw materials, partially finished goods, 
and even inventories of finished commodities which must be kept on hand 
to make it possible to fill orders as they come in. 

Any good whose manufacture makes use of capital must involve the 
passage of time in its production. For a capital asset is defined as a means 
of production which was itself produced before being put to its current use. 
Work on today’s newspaper must already have been underway when metal 
was being mined for the production of the linotype machines, when the 
trees for making the paper were being planted, and so on. 

Capital can be considered the input whose use inherently involves the 
passage of time. Any other input whose work requires time must by 
definition utilize capital in the form of goods in process. If a farmer produces 
a cheese which must be left to age before it can be sold, the product of his 
labor, in its unfinished state, becomes capital, a produced means of produc- 
tion. Moreover, the longer the cheese is left to ripen, the greater will be the 
capital cost of the project. 

All of this means that the amount of capital employed in a production 
process cannot be measured by a single number, say, by its pecuniary 
value. One must also specify time measures along with such values. If the 
production of a unit of item C employs a $20,000 machine for ten minutes, 
while a unit of D needs eighty minutes on this same $20,000 machine, it is 
clear that, other things being equal, D employs more capital in its manufac- 
ture than does C. The measurement of capital is therefore likely to involve 
at least two variables—quantity and time—and in general even two will 
not be enough. In principle, the analysis of a productive activity requires 
specification of all its inputs and outputs along with their dates. 

This also suggests that there are two basic ways in which the use of 
capital can be increased—we can either use more of the same type of 
capital (this is called capital widening) or we can switch to processes which 
involve longer investment periods (capital deepening). For example, we 
can widen our use of capital by employing 100 shovels (and 100 men) 
instead of 20, or, alternatively, we can substitute power-digging equipment 
for the shovels (and for some of the men), and since a steam shovel pre- 
sumably takes more time to produce and to use up than do spades, this is 
an act of capital deepening. Such a process, with relatively high capital- 
output and capital-labor ratios, is said to be relatively capital intensive or 
simply capitalistic. Unfortunately, as we will see, matters are not always 
so simple, and it will not always be possible to tell which of two production 
processes is the more capital intensive. 


2. Heterogeneity and Homogeneity of Capital: Putty vs. Clay 


In some analyses capital is treated as a disembodied malleable sub- 
stance. It is then treated as though it were homogeneous, movable at will 
from one sector of the economy into another, and usable in any of them as 
part of its productive process. This versatile substance has been referred 
to in recent discussions as "jelly" or “putty.” At the opposite end of the 
spectrum is the interpretation of capital as a collection of vastly differing 
concrete objects: machines, factory buildings, unfinished products (work- 
ing capital), and inventories of finished goods. There are also compromise 
“putty-clay” models in which capital starts out malleable but is then 
formed into specific objects and permitted to harden into shapes which 
can no longer be modified. 

Both viewpoints have some truth to them. Obviously, capital does 
always take the specific forms of the clay models. Yet in the long run it 
has considerable mobility. If a new type of equipment or a new industry 
shows itself more profitable than an old one, then the older capital will not 
be replaced as it wears out. Instead, the depreciation funds will be “dis- 
saved” in this process, and the real resources that they represent will, 
sooner or later, be reinvested in the new equipment or the new activity. 

We can be misled either by an analysis which abstracts completely 
from the specificity of capital assets or by one which concentrates on those 
differences and loses sight of the capability of resources to move to more 
profitable uses, the fundamental equilibrating mechanism of the competitive 
process common to classical, neoclassical, and Marxian analyses. 
The heterogeneity of capital goods means that a macroanalysis of produc- 
tion and distribution in terms of a few aggregative inputs such as labor 
and capital may be dangerous, though the seriousness of that danger is 
still a matter of lively debate. It is this problem that has forced much of 
recent neoclassical analysis to the use of full general equilibrium models 
treating each producers’ good as a separate commodity, each with its own 
supply-demand relationships but which can generally only be "solved" 
and analyzed simultaneously. 

In the absence of uncertainty, as assumed in this chapter, the putty- 
like property that capital acquires in the long run means that, under pure 
competition with freedom of exit and entry, all assets must yield precisely 
the same net return, for no one will ever invest in an asset that yields less 
than the maximum return available. It will be convenient to refer to that 
common rate of net return as the rate of interest, i. 


3. On Measures of Quantity of Capital: Preliminary 


Before turning to some of the results that have emerged from recent 
discussions on capital theory, we consider several attempts to define a 
numerical measure for the heterogeneous collection of inputs employed at 
different dates which together constitute the capital of a firm or an economy. 

From the beginnings of neoclassical theory there have been attempts to 
define the quantity of capital. For example, Böhm-Bawerk, starting from 
the view that capital is essentially time, defined what he called the "average 
period of production" involved in a particular production process. He 
measured this as a weighted average of the time intervals between the 
utilization of the different inputs and the emergence of the final output. 
The weights can be the physical quantities of the inputs if they are homo- 
geneous and so can be added together (e.g., if they all consist of a certain 
number of man-hours of labor of given skill), or these weights can be the 
values of those inputs. For example, if manufacture of a product ties up 
$100 for two years plus $50 for 1 year, the average period of production is 
(2 years × $100 + 1 year × $50)/($100 + $50) = 5/3 years. 

There are all sorts of objections to this measure, not the least of which 
is that a given input (e.g., a durable machine) can be used to produce a 
variety of outputs and we simply do not know what portion of that 
machine's cost to attribute to each product. More important, this measure 
of Böhm-Bawerk makes the error of forgetting the necessity of discounting 
if outlays at different dates are to be compared and added together. As 
we saw in the last chapter, it is not legitimate to evaluate the tying up of 
$100 for two successive years as twice $100, i.e., as equivalent to $200 tied 
up for only one year. 
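
Both the weighted-average calculation and the discounting objection can be made concrete. What follows is a sketch under stated assumptions: the function names and the 10 per cent interest rate are illustrative, and the second function simply applies ordinary annual compounding to show why $100 tied up for two years is not equivalent to $200 tied up for one.

```python
def average_period(outlays):
    """Value-weighted average period of production.

    `outlays` is a list of (amount, years_tied_up) pairs.
    """
    total = sum(amount for amount, _ in outlays)
    return sum(amount * years for amount, years in outlays) / total


def interest_cost(amount, years, i):
    """Compound interest foregone by tying up `amount` for `years` at rate i."""
    return amount * (1 + i) ** years - amount


if __name__ == "__main__":
    # The example from the text: $100 for two years plus $50 for one year.
    print(average_period([(100, 2), (50, 1)]))
    # The undiscounted measure treats these two cases as identical,
    # yet at 10 per cent (an illustrative rate) their interest costs differ.
    print(interest_cost(100, 2, 0.10), interest_cost(200, 1, 0.10))
```

The undiscounted average weights both cases in the last line as 200 dollar-years, although the compound interest foregone differs between them.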

A second approach utilized by the general equilibrium analysis gives 
up the attempt to measure aggregative capital and simply treats its com- 
ponents separately, item by item, with drill presses not compared directly 
with switchboards, but each being treated as a separate intermediate 
product whose quantity is measurable directly and is determined by the 
general equilibrium process. 

However, for some purposes this is an unsatisfactory solution. Just as 
it is often convenient to have a single index number to represent the more 
or less independent movements of a large multiplicity of prices, it can be 
useful to have a single measure of the complex of heterogeneous items 
comprising the capital stock of a firm, an industry, or even an entire 
economy. We have long ago given up the unrewarding search for an ideal 
index number since no single number can tell us correctly how all items in 
a large collection of prices are behaving. Similarly, any measure of capital 
is bound to give rise to anomalies. Yet it can be argued with some per- 
suasiveness that the only thing worse than the use of such an aggregative 
measure is complete unwillingness ever to adopt one; for, without it, 
macroeconomic or econometric analysis of problems involving capital may 
be difficult or even impossible. 


4. Capital Measurement: Dated Inputs and the Stationary State 


A widely used approach to capital measurement (one which we will use 
later) goes back to Wicksell at the turn of the century. It has been called 
the dated labor or dated input approach. It is a measure of competitive 
equilibrium value of the capital stock, rather than, in any sense, of its 
physical quantity. It is therefore dependent on the pricing (market 
valuation) process and, as we will see, upon its choice of interest rate. 
Some anomalies may therefore result if prices or interest rates change, 
because we may be forced to say that the quantity of capital has changed 
even though the stock of physical capital has not varied at all. 

To illustrate the approach, suppose a particular process requires four 
years to complete and each unit of output requires a sequence of outlays, 
$x_1$ dollars in its first year, $x_2$ in its second year, $x_3$ in the third year, and 
$x_4$ in the fourth and final year of the production process. Assuming prices 
are not changing, how much cost, in total, can we say the finished product 
incurs? The answer is that this cost, which is also the value of the product 
in competitive equilibrium, is equal to the money value of each of these 
outlays plus the interest on the amounts the producer has invested in them. 

The competitive equilibrium cost of a product will be equal to the sum 
of the values of the inputs needed to produce it plus the accumulated 
interest on each such outlay: 


(1)  $c = x_1(1 + i)^3 + x_2(1 + i)^2 + x_3(1 + i) + x_4.$ 
Because $x_1$ is expended three years before the product is finished, by the 
end of the process it has had an opportunity cost of three years’ compound 
interest, so that its accumulated value must now be $x_1(1 + i)^3$. A similar 
argument yields the remaining terms of the expression (1), which repre- 
sents the labor and interest cost (value) of a unit of the finished product. 
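
Expression (1) is easily evaluated for any sequence of outlays. The sketch below is illustrative only: the function name, the equal outlays of 10 per year, and the 10 per cent interest rate are assumptions and do not appear in the text.

```python
def unit_cost(outlays, i):
    """Expression (1): accumulated value of dated outlays at interest rate i.

    `outlays` lists x1, x2, ..., xn in chronological order; the last outlay
    is made in the final year and so bears no interest.
    """
    n = len(outlays)
    return sum(x * (1 + i) ** (n - 1 - k) for k, x in enumerate(outlays))


if __name__ == "__main__":
    x = [10.0, 10.0, 10.0, 10.0]  # illustrative outlays x1..x4
    print(unit_cost(x, 0.10))     # cost including accumulated interest
    print(unit_cost(x, 0.0))      # at a zero rate the cost is just the sum
```

With a zero interest rate the cost collapses to the simple sum of the outlays, so the difference between the two figures is the accumulated interest.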

However, if this production process is not a one-shot affair, the value 
of the firm’s total capital must include considerably more than that of its 
finished product inventory. In addition to the value of the finished product 
(1), it must include a considerable investment in goods “in the pipeline," 
some of which will emerge from the production process the following year, 
some the year after that, etc. The value of the firm's investment in these 
unfinished items must be added to the value of the firm's inventory of finished 
products to determine its total capital. 

We can illustrate this calculation by use of a construct, the stationary 
state, which is often employed in capital theory. Suppose the firm wishes 
to keep a steady flow of y units of output per year, each of which takes four 
years to complete. Then, at the end of each production year the company 
must have y units of goods that are completed (i.e., they were begun four 
years earlier), y units that are three-quarters complete (i.e., that were 
begun three years earlier), y units that are half complete, and y units that 
are just one-quarter finished. 


To determine the total amount of capital all these finished and un- 
finished items represent we must find the cost (competitive equilibrium 
value) of each of them. For this we must know the outlays of labor (and 
their dates) per unit of output for the various goods in process and for the 
finished items. These data are given in Table 1a. It shows, for example 
(last line), that goods 1/4 finished have so far had expended upon them only 
their first set of inputs, whose cost is $x_1$. Similarly (second line from the 
bottom), items 1/2 finished have so far received two doses of inputs, $x_1$ one 
year ago and $x_2$ in the current year. Table 1b includes the interest cost 
incurred by having these resources tied up by the expenditures specified in 
Table 1a. Naturally, the farther back the expenditures go, the larger the 
accumulated interest that represents their opportunity cost, so that outlays 
that occurred two years ago must be multiplied by $(1 + i)^2$, etc. 


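TABLE 1A. DATED INPUTS FOR FINISHED AND UNFINISHED GOODS 


---------------------------------------------------------------------- 
                                   Date of Outlay 
Batch of Goods    3 Years Ago    2 Years Ago    1 Year Ago    Now 
---------------------------------------------------------------------- 
Finished          $x_1$          $x_2$          $x_3$         $x_4$ 
3/4 Finished      0              $x_1$          $x_2$         $x_3$ 
1/2 Finished      0              0              $x_1$         $x_2$ 
1/4 Finished      0              0              0             $x_1$ 
---------------------------------------------------------------------- 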
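TABLE 1B. DATED INPUTS AND ACCUMULATED INTEREST: 
FINISHED AND UNFINISHED GOODS 


------------------------------------------------------------------------------ 
                                   Date of Outlay 
Batch of Goods    3 Years Ago       2 Years Ago       1 Year Ago      Now 
------------------------------------------------------------------------------ 
Finished          $x_1(1 + i)^3$    $x_2(1 + i)^2$    $x_3(1 + i)$    $x_4$ 
3/4 Finished      0                 $x_1(1 + i)^2$    $x_2(1 + i)$    $x_3$ 
1/2 Finished      0                 0                 $x_1(1 + i)$    $x_2$ 
1/4 Finished      0                 0                 0               $x_1$ 
------------------------------------------------------------------------------ 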


Now, the sum of all of the terms in Table 1b is the total value of the firm's 
capital per unit of output—the amount it must keep tied up in its produc- 
tion process year after year if it is to have a stationary annual output of 
one unit. 

In summary, in a stationary state, with a process involving outlays of 
$x_1$, $x_2$, $x_3$, and $x_4$ in the first, second, third, and fourth year of work on a 
product, the total value of the firm's capital will be yK, where y is its out- 
put and K is its capital per unit of output and is given by the sum of the 
values of the inputs that have gone into its inventory plus the accumulated 
compound interest on each such outlay. Adding together all the items in 
Table 1b we see that this quantity of capital per unit of output is given by 


(2)  $K = x_1(1 + i)^3 + (x_1 + x_2)(1 + i)^2 + (x_1 + x_2 + x_3)(1 + i) + (x_1 + x_2 + x_3 + x_4).$ 


Obviously, the value of the firm's capital per unit of output, as given by 
(2), will be greater than the value of a unit of its finished product, as 
shown in (1), since the capital includes goods in process as well as finished 
product inventory. 

From this calculation we can determine the firm's capital-output ratio 
simply by dividing (2) by (1). 

We can also calculate the value of the firm's annual expenditure on 
inputs (labor) from either the last column of Table 1a or that of Table 1b, 
for we know that in (each) current year the firm spends $x_4$ on each unit of 
goods which it is finishing, $x_3$ on each unit of goods which will be 3/4 finished, 
etc., so that, per unit of output, its total expenditure on inputs in (each) 
current year will be 


(3)  $E = x_1 + x_2 + x_3 + x_4.$ 


Since we are dealing with a stationary state in which expenditures are 
replicated precisely every year, the amount given by (3) must represent 
the firm’s annual expenditure on inputs. In a simple model in which the 
only inputs are labor and capital one can then calculate a labor-output 
ratio by dividing (3) by (1) or a labor-capital ratio by dividing (3) by (2). 

We observe, finally, that all these ratios are dependent on the value of 
i, the rate of interest, as is seen at once from the formula for value of output 
(1) and from that for the value of the firm’s capital (2). This means that 
even if there is absolutely no change in a particular productive process, 
the quantity of the firm’s capital, as we have defined it, will change if 
there is a change in the interest rate and so will its capital-output and its 
labor-output ratios. Thus, a given collection of buildings, machines, and 
goods in process whose physical makeup does not vary one iota will 
become a different quantity of capital (as evaluated by the dated input 
method) from what it was before a change in the interest rate. In other 
words, if there is a change in income distribution which modifies the rate 
of return to capital, then the valuation of the stock of capital will also 
(naturally) be affected. This illustrates a point that was mentioned earlier. 
The dated input method yields a measure of competitive equilibrium 
value of the capital stock and so does not correspond perfectly with any 
physical concept of capital. 
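
The dependence of these measures on the interest rate can be seen in a small calculation. This is a sketch under stated assumptions: the function names, the equal outlays of 10 per year, and the two sample interest rates are illustrative. Capital per unit of output is computed batch by batch, each unfinished batch being valued at the accumulated cost of the outlays it has received so far.

```python
def unit_cost(outlays, i):
    """Accumulated value of dated outlays; the latest outlay bears no interest."""
    n = len(outlays)
    return sum(x * (1 + i) ** (n - 1 - k) for k, x in enumerate(outlays))


def capital_per_unit(outlays, i):
    """Capital per unit of output in the stationary state.

    Sums the value of every batch in the pipeline: the batch that has
    received only x1, the batch that has received x1 and x2, and so on
    up to the finished batch.
    """
    return sum(unit_cost(outlays[:m], i) for m in range(1, len(outlays) + 1))


def annual_expenditure(outlays):
    """Current-year spending on inputs per unit of output."""
    return sum(outlays)


if __name__ == "__main__":
    x = [10.0, 10.0, 10.0, 10.0]  # illustrative outlays x1..x4
    for i in (0.0, 0.10):
        c = unit_cost(x, i)
        K = capital_per_unit(x, i)
        E = annual_expenditure(x)
        # capital-output, labor-output and labor-capital ratios
        print(i, K, c, K / c, E / c, E / K)
```

The physical process is identical in both rows of output; only the interest rate differs, yet the measured capital and all three ratios change.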

It has also been suggested that this interrelationship between valuation 
of capital and rate of interest brings with it an unfortunate problem of 
circularity, for the quantity of capital plays a role in the process of deter- 
mining the interest rate, but the interest rate, as we have seen, in turn 
affects the valuation of the quantity of capital. However, this in itself 
need not be a serious difficulty. It is a characteristic of all systems which 
require simultaneous solution and in which the value of every variable 
can affect that of every other. 


5. Determination of Interest Rate in the General Equilibrium Model 


Having discussed the measurement of capital we turn next to a more 
substantive issue, the determination of the interest rate. Today’s neo- 
classical analysis determines the interest rate as just one element in the 
array of prices that emerge from the competitive general equilibrium. There 
is a simple but clever device which permits the analysis of an intertemporal 
equilibrium over a finite number of periods to be translated into that of a 
single-period case. In the latter, we have $y_1, \ldots, y_n$ representing the out- 
puts of the n commodities produced by the economy. Corresponding to 
each such variable $y_k$ in the single-period model, in multiperiod analysis 
we utilize h variables $y_{k1}, y_{k2}, \ldots, y_{kh}$, thus considering only some finite 
number¹ of periods, h, where h is called the horizon. 


1 While the choice of any finite value for h is arbitrary, it need not be a serious prob- 
lem. In principle we can take the year h to lie billions of years in the future, beyond the 
likely span of human life. 
For example, if commodity k is shoes, $y_{k1}$ is the output of shoes in 
period 1, $y_{k2}$ is the output of shoes in the second period, etc. For purposes 
of this analysis we treat $y_{k1}$ and $y_{k2}$ as the outputs of two distinct com- 
modities not necessarily having more relation to one another than $y_{kt}$ and 
$y_{jt}$, say the outputs of shoes and slippers (or shoes and bananas) in the 
same period, t. To each such variable there corresponds its own pair of 
supply-demand relationships and its own price. Thus the system involves 
the n × h output variables 


(4)  $(y_{11}, \ldots, y_{1h}, y_{21}, \ldots, y_{2h}, \ldots, y_{n1}, \ldots, y_{nh})$ 


and the n × h prices 

(5)  $(p_{11}, \ldots, p_{1h}, p_{21}, \ldots, p_{2h}, \ldots, p_{n1}, \ldots, p_{nh}).$ 


To treat the determination of the values of these variables as a problem 
for simultaneous solution, that is, to render them directly comparable, 
these prices must all be expressed in terms of discounted present value. Having 
done so, one can, in principle, solve the general equilibrium problem defined 
by the set of supply, demand, and production relationships in exactly the 
same way as one solves the problem with n output and n price variables 
for a single period. 

Let us see now how that solution and its discounted prices also give us 
the equilibrium interest rate. As in the preceding chapter, let $D = 1/(1 + i)$ 
be the discount factor so that the present value of u dollars receivable one 
period in the future is p = Du. 

Let k be a particular resource (commodity) and let its undiscounted 
future price in period t + 1 be written $u_{k,t+1}$. We now take k to be the good 
in terms of which all intertemporal calculations are made; i.e., we use it as 
our standard of intertemporal price measurement, thus defining its price to 
remain constant²; i.e., for this good, its undiscounted future price, $u_{k,t+1}$, 
is defined to be the same as its present price, $p_{kt}$. Therefore, the discounted 
value of that good’s future price 


(6)  $p_{k,t+1} = Du_{k,t+1}$ 

must satisfy 

(7)  $p_{k,t+1} = Dp_{kt} = p_{kt}/(1 + i).$ 


If one has determined $p_{kt}$ and $p_{k,t+1}$ in the general equilibrium analysis, we 
can solve the preceding equation for the implied interest rate, i, obtaining 

2 For example, k can be money, whose unit price in every period is defined as unity. 
(8)  $i = (p_{kt}/p_{k,t+1}) - 1.$ 


Thus the interest rate is always implicit in the general equilibrium calcula- 
tion of prices. Consequently, if one accepts the analysis of price determina- 
tion in a static general equilibrium model, one must automatically accept 
the neoclassical model of interest determination. One can argue that both 
of them neglect important real-world phenomena such as imperfect com- 
petition and institutional constraints, but that is a criticism applicable to 
most of microeconomics and not to neoclassical capital theory alone. 
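
The way in which the interest rate is implicit in a pair of discounted prices can be shown in a few lines. The sketch below is illustrative: the function names and the sample figures are assumptions, and the standard good is taken, as in the footnote's example, to have an undiscounted price of unity in every period.

```python
def discounted_price(undiscounted_price, i, periods=1):
    """Present value of a price receivable `periods` periods ahead at rate i."""
    return undiscounted_price / (1 + i) ** periods


def implied_interest(p_now, p_next_discounted):
    """Interest rate implied by the standard good's current price and the
    discounted value of its next-period price."""
    return p_now / p_next_discounted - 1


if __name__ == "__main__":
    # Hypothetical equilibrium prices for the standard good.
    p_t = 1.0
    p_t1 = discounted_price(1.0, 0.05)  # discounted next-period price
    print(p_t1, implied_interest(p_t, p_t1))
```

Recovering the rate from the two prices in this way is simply the inversion of the discounting step, which is why no separate theory of interest is needed once the discounted prices are known.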


6. The Interest Rate and Producers’ Demand for Investment 


Since the interest analysis is, ultimately, a supply-demand model, we 
can learn more about it by examining more closely the capital supply and 
demand relationships, i.e., the relationship between the determination of 
desired investment and the desired saving which can be taken to supply 
the resources for that investment. We begin with the demand side—the 
demand of producers for net investment (additions to the stock of capital). 

In neoclassical theory every act of investment is treated as a trade-off 
between present and future consumption. Labor and raw material which 
could have served consumption today are instead used to produce ma- 
chinery or other capital goods which can increase the flow of consumers’ 
goods tomorrow. 

The equilibrium conditions for the firm’s allocation of resources between 
any two periods are, naturally, the same as those for any other resource 
allocation decision by a multiproduct firm (for it will be remembered that 
shoes today and shoes tomorrow are interpretable simply as two distinct 
products of the same firm). An equilibrium in which positive quantities of 
two goods are turned out requires that the marginal rate of transformation 
of the two items be equal to (the inverse ratio of) their relative prices (see 
Chapter 11, Section 11). That is, for any two outputs j and k, if, by trans- 
ferring a small quantity of resources from product j to product k, we give 
up $-dy_j$ of the former and obtain an additional $dy_k$ of the latter, profit 
maximization requires that 


$-dy_k/dy_j = p_j/p_k.$ 


Similarly, in planning for the allocation of resources between shoe produc- 
tion in the current period, t, and that in the next period, t + 1, we require 
(again using $u_{j,t+1}$ to represent the undiscounted price for commodity j in 
period t + 1) 

$-\frac{dy_{j,t+1}}{dy_{jt}} = \frac{p_{jt}}{p_{j,t+1}} = \frac{p_{jt}}{Du_{j,t+1}}$ 
since otherwise it would be profitable to reallocate resources. If the first 
term were larger than the second, for example, it would pay the firm to 
expand future shoe production at the expense of that of the earlier period. 
The preceding equation can obviously be rewritten directly as 


Uuakidysri € 


1 
zi 
Dj dy D x 


(9) 


Equation (9) is the basic equilibrium condition for intertemporal
resource allocations by the firm. It tells us that in equilibrium the transfer
of a dollar of investment from current production (where it can earn
p_{j,t} dy_{j,t}) to future output, where it earns u_{j,t+1} dy_{j,t+1}, must bring a return
exactly equal to 1 + i, the amount that the dollar could have earned by
being lent out at the rate of interest i. That is, at the margin the investment
one period later must replace its original cost and in addition bring in a
percentage return equal to i, the rate of interest, for otherwise it will pay
to borrow and invest some smaller or larger amount. Thus, we have


Proposition 1: In equilibrium the marginal net yield of additions to 
each firm’s capital must be equal to the rate of interest. 


This is the basic condition in the neoclassical model that determines 
the demand for resources for investment—the decisions of firms on the 
amounts of the resources they wish to use for investment. 

This can be represented in a diagram whose usefulness will become 
clearer presently. In Figure 1a we represent the undiscounted values of a 
single-product firm's outputs in the present period, t, and the following
period, t + 1. The curve TT' is the transformation or production pos-
sibility locus expressed in monetary units. A point on that locus represents
an efficient output combination in the sense that from any point such as
A it is impossible to increase output in period t + 1 without some sacrifice
of output in t. At a point such as A the producer is, in effect, investing ST'
units of potential output now (that is how much he "abstains" from turning
out now) and getting in return output valued at U dollars in the next period.
Thus, as the producer moves leftward along the curve he is deciding to 
produce less and invest more now, in return for a larger output in the 
future. We now introduce a family of lines, corresponding to the iso-profit 
lines of the single-period theory of the firm. Each such line represents a 
constant sum of present values of current and future outputs. It is easy to 
show³ that under perfect competition with prices constant for the firm,

3 Proof: The present value of whatever output combination the firm selects is

$$p_t y_t + p_{t+1} y_{t+1} = p_t y_t + D\,u_{t+1} y_{t+1} \tag{10}$$

[Figure 1, panels (a) and (b): vertical axis u_{t+1}y_{t+1}, horizontal axis p_t y_t.]


these are all straight lines with slope -(1 + i), for a given future quantity
of money, q_{t+1}, will have the same present value as the current amount
q_t if it is exactly (1 + i) times as large as the latter.

In Figure 1a we have several such lines, v0, v1, and v2. The highest of
these lines that is available to the producer, i.e., the one which is most
profitable, is attained at the point of tangency, E, between the production
possibility locus TT' and the line v1. There the slopes of the two loci are
clearly equal, and so we must have the slope of the transformation locus,
d(u_{t+1}y_{t+1})/d(p_t y_t), equal to the slope of the iso-profit line, -(1 + i). But
with prices fixed (under pure competition) that is the same as condition
(9), the basic equilibrium condition for intertemporal resource allocation
by the firm.
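
The tangency argument can be checked numerically. The sketch below assumes a purely hypothetical transformation locus, in which the value of next-period output equals A times the square root of (TOTAL minus the value of current output). The functional form, the constants A and TOTAL, and all function names are illustrative assumptions and do not come from the text. It locates the point on the locus with the highest present value by brute-force search and compares it with the point at which the slope of the locus equals -(1 + i).

```python
import math

# Hypothetical parameters of the transformation locus TT' (assumptions).
A, TOTAL = 20.0, 100.0


def future_value(current):
    """Value of next-period output attainable when `current` dollars
    of output are produced now (assumed functional form)."""
    return A * math.sqrt(TOTAL - current)


def present_value(current, i):
    """p_t*y_t + D*u_{t+1}*y_{t+1}, with discount factor D = 1/(1+i)."""
    return current + future_value(current) / (1.0 + i)


def locus_slope(current):
    """Derivative of future_value with respect to current output value."""
    return -A / (2.0 * math.sqrt(TOTAL - current))


def tangency_closed_form(i):
    """Solve locus_slope(c) = -(1+i) for c."""
    return TOTAL - (A / (2.0 * (1.0 + i))) ** 2


def tangency_grid(i, steps=10_000):
    """Brute-force search for the highest iso-present-value line."""
    grid = (k * TOTAL / steps for k in range(steps + 1))
    return max(grid, key=lambda c: present_value(c, i))


if __name__ == "__main__":
    i = 0.25
    e = tangency_closed_form(i)
    print("closed form:", e)
    print("grid search:", tangency_grid(i))
    print("slope at E :", locus_slope(e), "versus", -(1.0 + i))
```

Under these assumed numbers the two methods agree, which is simply condition (9) restated: the present-value maximum lies where the locus has slope -(1 + i).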


7. Interest Rate and the Supply of Saving: Lending and Borrowing 


Saving is taken to be the source of the resources needed to produce
capital. It represents new materials and labor which could have been used
for current consumption but which, instead, are held back (saved) in
order to make possible the production of larger outputs in the future. Thus
savings are the supply side of the supply and demand for new capital.

3 (continued) where, as before, u_{t+1} is the undiscounted price of the product in t + 1. Setting the
expression in (10) equal to any constant k we obtain the equation of an iso-present-value
locus

$$p_t y_t + D\,u_{t+1} y_{t+1} = k,$$

that is,

$$u_{t+1} y_{t+1} = -p_t y_t / D + k/D,$$

or, since D = 1/(1 + i),

$$u_{t+1} y_{t+1} = -(1 + i)\,p_t y_t + (1 + i)\,k. \tag{11}$$

This is clearly the equation of a straight line in our graph, with u_{t+1}y_{t+1} represented
on the vertical axis and p_t y_t on the horizontal axis, and the slope of this iso-present-value
locus is -(1 + i).

The consumers’ intertemporal equilibrium relationships which deter- 
mine the supply of savings are also no different from those in a single period 
model. Equilibrium can be taken to require tangency between the con- 
sumer's indifference curve for any pair of products and the corresponding 
price line. Thus the marginal rate of substitution between any such pair 
of products (the ratio of their marginal utilities) must be equal to their 
discounted price ratio. Letting x_{t+1} represent the amount of a given product
consumed in period t + 1 and x_t the corresponding variable for period t,
-dx_{t+1}/dx_t, the absolute value of the slope of the indifference curve between
x_t and x_{t+1}, must therefore equal p_t/p_{t+1} = p_t/(D u_{t+1}), so that

$$-\frac{u_{t+1}\,dx_{t+1}}{p_t\,dx_t} = \frac{1}{D} = 1 + i, \tag{12}$$

where -dx_{t+1}/dx_t = mu_{x_t}/mu_{x_{t+1}} is the absolute slope of the pertinent indifference
curve. That is,


Proposition 2: Equilibrium of the consumer (saver) requires that his 
marginal rate of substitution between consumption in one period and 
consumption a single period later equal 1 + i.


Condition (12) is the equilibrium requirement for the consumer's ap- 
portionment of his resources between present and future. Intuitively, this 
is so because every dollar’s reduction in his current consumption represents 
saving which he can lend out at the current i per cent interest rate and
therefore it permits him to consume an additional 1 + i dollars in goods
in the following period. Condition (12) tells us that in equilibrium the i
dollar interest return he obtains in this process must at the margin just
compensate him for his subjective loss resulting from the postponement of
consumption to the future (the substitution of dx_{t+1} for dx_t), or else it
will pay him to change his present consumption (saving) level.

We can learn something more by combining the equilibrium diagram 
for the consumer with that for the producer.⁴ Figure 1b reproduces from
Figure 1a the transformation locus TT' and the iso-profit line v1, which is
tangent to TT' at E. We left the producer in equilibrium at point E at
which his equilibrium requirement (9) is satisfied and the present value of
his earnings is maximized. But the producer is also a consumer: he must
take his earnings and also divide them between consumption and saving.
Will he necessarily be satisfied at point E in his role as consumer?

4 This diagram is derived from the work of Hirshleifer and Fisher. See Jack
Hirshleifer, “On the Theory of Optimal Investment Decision,” Journal of Political
Economy, Vol. 66, August 1958, pp. 329-352.

The answer is that in general he will not. The diagram contains two of
his indifference curves, I1 and I2, between his present and future con-
sumption. Equilibrium for him in his role as consumer requires that his
indifference curve be tangent to his budget (price) line v1. Here v1 is his
budget line (10) since its slope is -(1 + i) and since it goes through point
E where he is left by his productive activity. Now there is no reason to
expect his indifference curve, I1, through E to happen to be tangent to v1
at that point, as consumer equilibrium would require. Indeed, as it is
drawn the two are not tangent there.

Rather, his consumer equilibrium point, the point of tangency between 
one of his indifference curves and v1, may well occur at some other point,
point H in Figure 1b. That is, as drawn it is only at point H that tangency 
condition (12) for his equilibrium as a consumer is satisfied, while tangency 
condition (9) for his equilibrium as a producer holds only at E. 

This is not an irreconcilable contradiction. Rather it means that the
producer-consumer must be taken to move to his final equilibrium in two
steps. As producer he first goes to E, which maximizes his purchasing
power as a consumer [it puts him on the highest iso-profit line the tech-
nological possibilities (TT') permit him to attain]. Then he takes this
purchasing power and divides it between present and future as best suits
his tastes, moving to point H, the highest indifference curve he can attain
with this highest of attainable iso-profit lines. How does he get from E to
H? By lending or borrowing on the basis of his earnings at E, at the given
interest rate, i. If his consumer equilibrium point happens to fall to the
left of E, he will be a lender. That is, he will consume less than his current
output at E, and the value of the decreased consumption, p_t y_t - p_t x_t, is
then lent out at interest rate i, enabling him to increase his consumption
in the next period by GH, which equals 1 + i times the amount he has
lent out. This follows from the fact that the absolute slope of v1 is 1 + i,
so that Δ(u_{t+1}y_{t+1})/Δ(p_t y_t) = 1 + i or Δ(u_{t+1}y_{t+1}) = (1 + i) Δ(p_t y_t).

Similarly, if his consumer equilibrium point on price line v1 happens to
fall somewhere to the right of E, say at K, it means that he will borrow to 
get from E to K and will consequently not be able to consume as much in 
the following period as he would have at E. 
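
The two-step move from E to H can be illustrated with a small calculation. The sketch below assumes, purely for illustration, a utility function ln(c_now) + beta·ln(c_next) and a production point E worth 100 now and 50 next period; neither the utility form nor these numbers come from the text, and the function name is hypothetical. Given E and the interest rate, it computes the consumption point on the budget line through E and reports how much is lent (positive) or borrowed (negative).

```python
def consumer_choice(e_now, e_next, i, beta):
    """Second step of the two-step equilibrium.

    The budget line passes through E = (e_now, e_next) with slope -(1+i).
    With the assumed utility ln(c_now) + beta*ln(c_next), the tangency
    point has c_next / (beta * c_now) = 1 + i.
    Returns (c_now, c_next, lent); lent > 0 means he is a lender.
    """
    wealth = e_now + e_next / (1.0 + i)          # present value of E
    c_now = wealth / (1.0 + beta)
    c_next = beta * (1.0 + i) * wealth / (1.0 + beta)
    lent = e_now - c_now
    return c_now, c_next, lent


if __name__ == "__main__":
    i = 0.25
    for beta in (0.9, 0.2):  # patient versus impatient consumer (assumed)
        c_now, c_next, lent = consumer_choice(100.0, 50.0, i, beta)
        role = "lender" if lent > 0 else "borrower"
        print(f"beta={beta}: consumes {c_now:.2f} now, {c_next:.2f} later; "
              f"{role}, amount {abs(lent):.2f}")
        # The gain (or loss) in next-period consumption is (1+i) times
        # the amount lent (or borrowed).
        print("  change in future consumption:", round(c_next - 50.0, 2),
              "=", round((1.0 + i) * lent, 2))
```

In both cases the change in next-period consumption equals 1 + i times the amount lent or borrowed, which is the relation the text reads off the slope of v1.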

Thus, we have seen in this and the preceding section how the interest 
rate affects the behavior of producers and consumers, investors and savers. 
These relationships constitute component parts of the general equilibrium
model which, as we saw earlier, can be taken to determine the interest rate
which determines the savings-investment behavior we have just investi-
gated.


8. Digression on Monetary Interest Theory 


The reader may have been given the impression that the neoclassical 
theory operates in a totally different world from more recent approaches 
to interest analysis, for our discussion has been framed largely in terms of 
“real” magnitudes: the supply of saving (defined to consist of goods
produced but not immediately consumed) and the demand for resources 
for investment (connoting the construction of machines and the accumula- 
tion of physical inventory, etc.). Where in all this is there room for con- 
cepts such as Keynesian liquidity preference, with its heavy emphasis on 
the role of money in the determination of interest rates? 

Actually, classical and neoclassical theorists were well aware that 
interest is, in a way, a monetary phenomenon and that the rate of interest 
is affected by monetary occurrences. Indeed, their conception of the process 
involved was not entirely foreign to current views on the subject. However, 
in their opinion, monetary influences on interest rates were essentially 
transitory. 

Let us outline their model briefly, roughly following the position of 
Alfred Marshall on the subject. For this purpose let us examine what 
happens if, starting from a position of full employment and general 
equilibrium, there is a sudden increase in the money supply, all other things 
remaining equal. Marshall readily concedes that this will cause a reduction 
in interest rates because there will now be more cash than people will wish 
to hold at the initial high interest rate, and so there will be a rise in the 
supply of loans (including the willingness to purchase bonds as one im- 
portant form of lending). 

However, the new lower interest rate will make it profitable to employ 
more roundabout processes. More money will be borrowed for investment,
and more labor and raw materials will be demanded for the purpose. The
result will be a bidding up of wages and prices.

But this very increase in price level must serve effectively to decrease 
the money supply. For example, a doubling of prices will cut the purchasing
power of a given stock of money in half. We say that in this way rising 
prices serve to reduce the real money supply. 

Thus the initial increase in money supply sets into motion a train of 
events by means of which it, at least partly, eliminates itself. How far will 
this process go on? The answer is that it will continue until the real money 
supply and, hence, the rate of interest have been returned to their original 
levels. So long as the real money supply is above its initial level, the rate of
interest will remain depressed below the level which equates supply and 
demand for capital-producing resources. But this means that more re- 
sources are demanded by investors than are supplied by savers—there 
must be an excess demand for labor and raw materials to be used in the 
manufacture of capital goods. For this reason, the inflationary process 
must continue on until the cause of the disturbance, the increased money 
supply, has eliminated itself completely. Thus, the neoclassical theory is 
perfectly consistent with a short-run monetary analysis of interest rates. 
It maintains, however, that in the long run the influence of monetary 
events must eliminate itself and that interest rates must always return to 
the levels determined by the two relevant "real" phenomena, the supply 
of savings as governed by time preference and the demand for savings as 
determined by the marginal productivity of more roundabout processes, 
i.e., the marginal productivity of investment. 

We observe that the full employment assumption implicitly plays a 
very important role in this neoclassical argument, for if there were ex- 
tensive supplies of unemployed resources, an excess demand for these 
resources would very likely lead to more employment rather than to 
higher prices. In this way, the Marshallian model is apt to break down in 
a depressed economy. 


9. Diminishing Returns and the Neoclassical Savings-Investment Parable 


We now have at our disposal a formal model of the deter- 
mination of the interest rate and the supply-demand relationships 
which it encompasses. We come now to the crucial question—what, if 
anything, does this model tell us about patterns of economic behavior?
In this section we will describe some such results that do hold at least in 
certain cases. Until recently it was thought that they were valid generally. 
But as we will see, it is now agreed that they do not hold universally. The 
remaining issue is whether, in fact, they apply very widely or whether they 
represent only a very special sort of case. 

Sections 6 and 7 gave us marginal (first-order) conditions (9) and (12)
for equilibrium of saving and investment in a neoclassical world. As usual, 
to assure us that we are at a true maximum when these requirements are 
fulfilled, we need second-order conditions (corresponding to the require- 
ment in a one-variable model that the second derivative of the maximand 
be negative). And, as usual, these conditions involve requirements about 
the concavity of the consumers’ (intertemporal) utility functions, con- 
vexity of the (intertemporal) production relationships, etc. As in most of 
this volume, we will not review these second-order requirements in detail. 
Rather, the issue we turn to now is something these second-order con-
ditions were, until very recently, thought to imply but which has been shown
to be a non sequitur in the Cambridge (England)-Cambridge (Massa-
chusetts) discussion referred to in the chapter on distribution.

Implicit in the old-neoclassical analysis is a story involving increasing 
disutility of postponed consumption and diminishing returns to invest- 
ment. Larger quantities of savings are assumed to involve a greater 
marginal disutility because the individual in question is thereby required 
to postpone the current consumption of items of increasing importance to 
him. When he is saving just a little, a small addition to saving only requires 
him to postpone the consumption of frivolities which matter very little 
to him. On the other hand, when he is saving a great deal his current 
consumption will already have been stripped down to a bare minimum of 
necessities, and the marginal cost of further saving will therefore be very 
high. The moral of this portion of the story thus appears to be that if an 
economy wants its members to save it must be prepared to pay the price 
in terms of a higher interest rate. A high interest rate is the necessary
reward to savers, one that is required to get them to save as much as is 
desirable for the social welfare. 

The remainder of the tale relates to the allegedly diminishing returns 
to investment. Here it is held that resources are first committed to the
investment opportunities with the highest net return, so that if there is an
autonomous increase in saving and more and more resources become
available, they will have to be used in increasingly inferior investment
opportunities. When only small quantities of resources are available they
will be invested in activities which yield, say, 20 per cent or more, but if 
more resources are provided for investment, since the activities yielding 
20 per cent will have been used up, society will have to be satisfied with 
others that yield perhaps only 15 per cent, and so on. Looked at the other 
way, this says that if there is no change in technology or any of the other 
production conditions, producers can only be induced to invest additional 
resources by offering them at a lower price (a lower interest rate). The 
demand curve for resources must be negatively sloping, and, other things 
being equal, a higher interest rate will always be associated with a lower 
quantity of investment. 

These two key parts of the neoclassical story—the rising relative 
marginal disutility of saving and the diminishing marginal yield to invest- 
ment—may actually be true, but what the recent Cambridge-Cambridge 
discussion has shown is that their truth or falsity can only be settled 
empirically. As we will see presently, they do not follow from the usual 
theoretical premises. This issue is of some importance because if the as- 
sumptions are false we can no longer take ourselves to be living in a world 
comfortable for analysis in which increased interest rates are required as 
a reward to savers to elicit from them more resources for society's invest-
ments, nor can we assume that declining interest rates will always lead
producers to invest more and thus to provide more for the future at the
expense of the present. 

We will say little more about the role of savings in the neoclassical 
analysis and give only a short summary of the criticisms that have been 
levelled at it. Then we will turn to a far more detailed review of the other 
side of the neoclassical capital theory—its rich investment models. 


10. Institutional and Sociological Determinants of Saving 


The savings side of the Cambridge attack upon the neoclassical tale can 
be summed up briefly. The Anglo-Italian group has argued that in reality 
a rise in the rate of interest will not produce a significant increase in saving. 
Partly this is so because the income effect and the substitution effect of a 
rise in interest rates may plausibly be taken to work in opposite directions, 
as was shown in the chapter on income distribution. The substitution effect 
of increased interest does work to elicit more saving. But higher interest 
rates also increase savers’ real incomes and enable them to satisfy their 
accumulation objectives (say, the desire to accumulate enough to buy a 
house or a car) at a lower rate of saving than they would have otherwise. 
The net result may well be that a change in interest rate produces a 
negligible effect on the amount saved up by society. 

If not the interest rate, what then does determine the level of saving? 
In the Cambridge, United Kingdom, view it is largely decided by several 
institutional influences: 


(a) The decisions by business management determining the 
amounts of their profits that they will plough back into their firms— 
the substantial amounts of savings of stockholders’ resources carried 
out by firms themselves. 

(b) The division of national income between workers and 
capitalists. Since capitalists are taken, as part of their historical 
function, to be accumulators par excellence, they are assumed to 
have a higher marginal and average propensity to save than workers. 
Thus with a given level of national income, the larger the share of it 
going to capitalists, the more society will end up saving. This is the 
foundation of the Kaldor model of income distribution described in 
the chapter on distribution. 


Now there is no doubt some truth to both these assertions. For example,
we do know that a very substantial share of the saving that takes place in
the United States takes the form of plowback by the firm, which thereby
holds back profits from the hands of the company’s stockholders and
appears to decide for them how much of this flow of company earnings
should be reinvested. However, this relationship need not be just what it 
seems at first. Even with firms making plowback decisions, stockholders 
are not altogether powerless to control their total saving. Suppose a stock- 
holder wanted to save $125 in a given year but finds that the firm in which 
he holds shares has retained $50 of his earnings that year. By saving only 
$75 instead of $125 as he had originally planned, he can still keep his total 
savings to the level he originally desired. The critical question is whether 
stockholders really do or do not make such offsetting adjustments. If they 
do, the level of saving will still be determined by such individuals and 
their response to financial inducements such as the interest rates. If they make 
no such adjustment, an important link in the neoclassical chain connecting 
saving and interest rates will have been broken. However, whether or not 
savers as a group make such an adjustment simply cannot be determined 
by a priori surmise. It is a matter for empirical investigation, and the issue 
is still far from being settled.⁵


11. On Some Simple Technologies and Neoclassical Investment Models 


The other side of the Cambridge, United Kingdom, attack—its objec- 
tions to the neoclassical story on choice of investment technique—is 
considerably more complex. In order to understand these objections, we 
must first examine in greater detail the relevant portions of the earlier 
neoclassical discussions, notably in the work of the Austrian economist, 
Eugen von Böhm-Bawerk (1851-1914). We will deal with cases in which
the Cambridge problems do not arise. In the next section, by comparison
with these straightforward situations, we will see clearly how, in other
situations, difficulties can arise.⁶

We will see how, in the neoclassical models, reduced interest rates 
always bring with them production processes that are more capital intensive 
and which (in the long run) yield higher outputs per member of the labor 
force. Thus, reduced interest rates become a prime instrument of increased 
productivity and rising living standards. They increase the income of the 
labor force by giving it more capital equipment to increase its productivity. 
However, we will find in the following section that things need not always 
follow this straightforward pattern. 


5 There is some preliminary evidence that household savers do adjust their savings 
to offset business and governmental saving decisions. See Paul A. David and John L. 
Scadding, “Private Savings: Ultrarationality, Aggregation and ‘Denison’s Law’,”
Journal of Political Economy, Vol. 82, March/April 1974, pp. 225-50.

6 The discussion in this and the following sections relies heavily on the masterful
exposition of the issues in Paul A. Samuelson, “A Summing Up," Quarterly Journal of 
Economics, Vol. 80, November 1966, pp. 568-83. 
We deal with a type of production technology that is a modification of
what has been described as the “point-input, point-output case." Labor 
is expended at one or several dates during the production of a commodity 
and then at some specific date the product is considered finished. In this 
model increasing capital intensiveness always takes the form of a longer 
(more time-consuming) process which permits labor to be saved. As an 
example we consider three processes for the production of a given quantity 
of firewood. In process I, one plants some fast-growing bushes and uses 
them one year later, with much labor needed to gather the required amount 
of wood. In process II, one plants two years before using the wood, ex- 
pending an intermediate amount of labor in watering and weeding one year 
before the wood is used. In process III, the wood is permitted to grow for 
three years with small amounts of labor devoted to its care during each of 
the years of its growth. The processes have been ordered so that each one 
is more capital intensive, or roundabout than its predecessor—each involves 
an earlier expenditure of preliminary labor than the one before it. The 
earlier outlays of labor are undertaken because they decrease the amount 
of labor needed to produce the desired output by an amount sufficient to 
compensate for the earlier outlay of work. Thus, with increased round- 
aboutness there will be a reduction in the total expenditure of labor per 
unit of output. Table 2 gives some illustrative labor-input requirement 
figures showing this relationship for output of firewood in year t + 3 via 
each of our three processes: 


TABLE 2

Outlays of Labor (hours), by Year

Process   Capital Intensity    t    t+1   t+2   Total
I         low                  0     0    12     12
II        intermediate         0     4     4      8
III       high                 2     2     2      6


We see that process I involves the outlay of 12 hours of labor in year 
t+ 2, the year before the wood is used, process II uses 4 man-hours in 
each of the two years before the firewood is obtained, etc. As we go from 
process I to II to III the technology becomes successively more “round- 
about" or capital intensive, with the total labor expended per unit of output 
declining, as shown by the successively decreasing numbers in the last 
column of the table. 




Which of the three processes in our example it will in fact pay to use 
(assuming that at least one of them yields a positive net benefit) depends 
on the discounted present values of the costs of the three processes and 
that, in turn, depends on the interest rate, i. Using c_j to represent the cost
of process j and D once more as the discount factor D = 1/(1 + i), those
three discounted cost figures per unit of output are:

$$\begin{aligned}
c_{\mathrm{I}} &= 0 + 0D + 12D^2 = 12D^2 \\
c_{\mathrm{II}} &= 0 + 4D + 4D^2 = 4D + 4D^2 \\
c_{\mathrm{III}} &= 2 + 2D + 2D^2.
\end{aligned}$$


We will see now that as the rate of interest rises [the value of D = 1/(1 + i)
falls] less roundabout techniques will become cheaper. At low interest
rates the most capital intensive technique, III, will be cheapest. When i is
sufficiently high, intermediate technique II will become less costly than III.
Finally, when i is very high, the least roundabout technique I will become
superior.

To take two extreme examples, if the interest rate were zero so that
D = 1/(1 + 0) = 1 we have c_I = 12, c_II = 8, and c_III = 6, and the very
capital-intensive process III is clearly the most economical way to get
things done.⁷ On the other hand, suppose the rate of interest were as high
as 100 per cent per period (i = 1), where this unrealistic figure is used
for simplicity of calculation. Then D = 1/(1 + 1) = 1/2 and D² = 1/4, so that
we obtain c_I = 12/4 = 3, c_II = 4/2 + 4/4 = 3, and c_III = 2 + 2/2 + 2/4 = 3.5.
Thus, at this interest rate, processes I and II are equally expensive, while
this time process III has become more expensive than either of the others.
A repetition of the calculation for the still-higher interest rate i = 2 will
readily show that now c_I < c_II < c_III.
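
The three cost calculations are easy to reproduce. The following sketch takes the labor outlays from Table 2 and discounts them to year t at D = 1/(1 + i); the dictionary and function names are illustrative choices, not notation from the text.

```python
# Labor outlays (hours) in years t, t+1, t+2, taken from Table 2.
LABOR_OUTLAYS = {
    "I": (0, 0, 12),
    "II": (0, 4, 4),
    "III": (2, 2, 2),
}


def discounted_cost(outlays, i):
    """Present value in year t of a stream of labor outlays,
    using the discount factor D = 1/(1+i)."""
    d = 1.0 / (1.0 + i)
    return sum(hours * d ** t for t, hours in enumerate(outlays))


def cost_table(i):
    """Discounted cost of each process at interest rate i."""
    return {name: discounted_cost(o, i) for name, o in LABOR_OUTLAYS.items()}


if __name__ == "__main__":
    for rate in (0.0, 1.0, 2.0):
        costs = cost_table(rate)
        line = ", ".join(f"c_{k} = {v:.3f}" for k, v in costs.items())
        print(f"i = {rate}: {line}")
```

Running it at i = 0, 1, and 2 gives the figures quoted above: 12, 8, 6 with no discounting; 3, 3, 3.5 at i = 1; and c_I below c_II below c_III at i = 2.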

The reason for the observed relationship between the magnitude of the 
interest rate and the roundaboutness of the least costly technique is not 
difficult to see—heavy discounting (a high interest rate) always favors 
processes whose outlays occur later because they permit postponement of 
the dates at which one ties up resources whose opportunity cost is the high 
i per cent per period. 

Figure 2a shows how the present values of c_I and c_II behave as i in-
creases, with the cost of the least capital-intensive technique, c_I, starting
off much higher than c_II at i = 0 but dropping far more sharply than the
latter so that it catches up at the crossover point⁸ A. Figure 2b also in-
cludes the curve depicting c_III as a function of i. Since the equation of that
curve is c_III = 2/(1 + i)² + 2/(1 + i) + 2 we see that it starts off at
c_III = 6 when i = 0 and asymptotically approaches c_III = 2 as i approaches
infinity.


Figure 2


Several conclusions can be drawn about this sort of simple case (often
referred to as the neoclassical parable), the case in which increased
roundaboutness consists of a spreading of the time interval over which
labor is expended, along with a reduction in total labor outlay:


PARABLE Property 1. (As Figure 2 has shown) increased interest
rates will always lead to the adoption of less roundabout processes.⁹
This in turn implies


7 This is the case in which there is no discounting so that an hour of future labor is
precisely equivalent to an hour of current labor. Then it will always pay to select
the process with the lowest total labor outlay, whatever its timing.

8 We find this crossover point by solving for D or i the equation c_II = c_I, that is,

$$12D^2 = 4D + 4D^2 \quad\text{or}\quad \frac{12}{(1+i)^2} = \frac{4}{1+i} + \frac{4}{(1+i)^2},$$

which gives, multiplying through by (1 + i)²,

$$12 = 4(1 + i) + 4 \quad\text{or}\quad 12 = 8 + 4i,$$

so i = 1 is our crossover point. Similarly, for the crossover point between, say, c_II and
c_III (point B in Figure 2b), we solve for i the equation c_III = c_II, or 2 + 2/(1 + i) +
2/(1 + i)² = 4/(1 + i) + 4/(1 + i)², or 2(1 + i)² + 2(1 + i) + 2 = 4(1 + i) + 4. This
yields the quadratic equation in i, 2i² + 2i - 2 = 0, whose only positive root is
approximately i = 0.6.
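
The two crossover rates derived in this footnote can be confirmed numerically. The sketch below uses the labor outlays of Table 2 and a plain bisection search on the difference between the discounted costs of two processes; the function names, the search bracket, and the iteration count are illustrative assumptions.

```python
import math

# Labor outlays (hours) in years t, t+1, t+2, taken from Table 2.
PROCESS_I = (0, 0, 12)
PROCESS_II = (0, 4, 4)
PROCESS_III = (2, 2, 2)


def discounted_cost(outlays, i):
    """Present value in year t of the labor outlays at interest rate i."""
    d = 1.0 / (1.0 + i)
    return sum(hours * d ** t for t, hours in enumerate(outlays))


def crossover_rate(more_roundabout, less_roundabout,
                   lo=0.0, hi=10.0, iterations=100):
    """Interest rate at which the two processes cost the same.

    Bisection on gap(i) = cost(less roundabout) - cost(more roundabout),
    which is positive at i = lo and negative at i = hi for these data.
    """
    def gap(i):
        return (discounted_cost(less_roundabout, i)
                - discounted_cost(more_roundabout, i))

    if gap(lo) <= 0 or gap(hi) >= 0:
        raise ValueError("no sign change on the chosen bracket")
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    print("point A (I vs II):  i =", crossover_rate(PROCESS_II, PROCESS_I))
    print("point B (II vs III): i =", crossover_rate(PROCESS_III, PROCESS_II))
    # Positive root of 2i^2 + 2i - 2 = 0, for comparison with point B.
    print("quadratic root:", (math.sqrt(5.0) - 1.0) / 2.0)
```

The search returns i = 1 for point A and a value close to 0.618 for point B, which is the positive root of 2i² + 2i - 2 = 0 that the footnote rounds to 0.6.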


9 To show that this is always true in these neoclassical models, we can express the
cost of the more roundabout of two processes (process r) as

$$c_r = \sum_{t=0}^{n} \frac{a_t}{(1+i)^t} = a_0 + \frac{a_1}{1+i} + \cdots + \frac{a_n}{(1+i)^n}$$
PARABLE Property 2. There must be diminishing marginal returns
to the increased use of capital involved in more roundabout processes. 
For, by Proposition 1, in equilibrium the interest rate will equal the 
marginal yield of capital. Thus, we see that increased roundaboutness 
(added investment) will always be associated with a lower interest rate 
and, hence, a lower marginal investment yield. In terms of Figure 3a, the 
slope of the curve of marginal product of capital must be negative if a 
reduced interest rate is always to increase the equilibrium use of capital. 


PARABLE Property 3. Output per worker, y/L, will always ultimately
fall as interest rates rise so that less roundabout methods are adopted,
and less current consumption is sacrificed for future output. This follows
directly from the premise that a decrease in roundaboutness involves a
rise in the labor-output ratio, L/y. Thus, since with higher interest rates
production becomes less roundabout, L/y will rise and hence y/L, output
per person, will fall. In our example, as rising interest rates lead to a
switch from most roundabout process III to process II and then to pro-
cess I, output per labor-hour will decline from 1/6 to 1/8 to 1/12 of a ton of fire-
wood. This is so, since by Table 2, the labor required per unit of output is
6, 8, and 12 hours respectively for processes III, II, and I.¹⁰


Figure 3


and that of a less roundabout process, s, yielding its output at the same time, as

\[ c_s = \sum_{t=1}^{n} \frac{b_t}{(1+i)^t} = \frac{b_1}{1+i} + \cdots + \frac{b_n}{(1+i)^n}. \]

Here $a_t$ is the $t$th-period investment of labor in process r and $b_t$ is that in process s, and
where $\sum a_t < \sum b_t$. It should be clear that at i = 0, $c_r = \sum a_t$ and $c_s = \sum b_t$, so
that $c_r$ will then always be less than $c_s$. Therefore, at a zero interest rate the more
roundabout process will be the cheaper. But as the interest rate rises, $c_r$ will increase relative
to $c_s$. Indeed, from the formulas it is obvious that as i approaches infinity, $c_s$ will
approach zero, while $c_r$ will approach $a_0$, so that at a rate of interest sufficiently high, the
less roundabout process must become less expensive in present value.

¹⁰ After our discussion of the illegitimacy of adding together sums for different time
periods, the reader may be uncomfortable about our adding of the labor spent in
PARABLE Property 4. In the neoclassical models output grows less
and less capital intensive as the interest rate rises. For it can be shown
that in any one given process, as the rate of interest rises, the ratio of the
value of the total capital to the value of total output will fall.¹¹ The
capital-output ratio will also fall when, as a consequence of a rise in
the interest rate, one switches to less roundabout processes. Thus any
particular rise in interest rates will decrease the capital-output ratio whether
or not it leads to a change in the process utilized.


12. Reswitching of Techniques and Contradiction of the Parables 


We turn now to an illustrative pair of techniques in which, as we will 
see, the apparently normal behavior of the preceding cases breaks down 
and the four parable properties are no longer valid. That is, lower interest 
rates need not lead to more roundaboutness, a higher capital-output ratio, 
or increased output per man, and the marginal product of capital need 
not always decline. The examples we will now discuss involve the phe- 
nomenon of reswitching, which has been one of the foci of the debate 
between Cambridge (U.K.) and Cambridge (Mass.). It must, however, be 
noted that reswitching is not the only case in which such pathological 


different periods to calculate the total labor used in process III, for example, at 6 man 
hours per unit of output. But this can be interpreted in terms of the stationary state 
model of Section 4 to yield the same result. Consider three fields each planted with 
what will someday be a ton of firewood, but in one of them the trees are just one year 
old, in one they are two years old, and in one they are three years old. If each year the 
field with the three-year-old trees is harvested and replanted, the process becomes 
continuous, yielding one ton of firewood each year. Moreover, it requires an outlay 
each year of 2 man-hours of labor in each of the three fields. Thus 6 hours of labor will
be spent each year and output will be one ton each year, giving us an annual
output-labor ratio of 1/6 ton per man-hour with no problem about adding up inputs utilized at
different dates.

¹¹ This can be verified by utilizing the measure of capital defined in Section 4.
Equations (1) and (2), respectively, give the values of output and of capital in a process
that lasts four years (the expressions can be generalized to an h-year process in an
obvious way). The capital-output ratio, then, is obtained by dividing (2) by (1), yielding

\[
\frac{\text{Capital}}{\text{Output}}
= \frac{x_1(1+i)^3 + (x_1+x_2)(1+i)^2 + (x_1+x_2+x_3)(1+i) + (x_1+x_2+x_3+x_4)}{x_1(1+i)^3 + x_2(1+i)^2 + x_3(1+i) + x_4}
\]
\[
= \frac{x_1(1+i)^3 + x_2(1+i)^2 + x_3(1+i) + x_4}{x_1(1+i)^3 + x_2(1+i)^2 + x_3(1+i) + x_4}
+ \frac{x_1(1+i)^2 + (x_1+x_2)(1+i) + (x_1+x_2+x_3)}{x_1(1+i)^3 + x_2(1+i)^2 + x_3(1+i) + x_4},
\]

where the next-to-last fraction is equal to unity, and the last fraction clearly approaches
zero as i approaches infinity. Thus we confirm that for a single process of the sort
described, the capital-output ratio will indeed decline when the rate of interest rises.
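A small numerical illustration of this footnote may help. It is only a sketch under stated assumptions: output is valued as the outlays compounded to the end of the process, capital as the cumulative outlays compounded in the same way (our reading of the ratio in this footnote), and the equal outlays of one unit per year are hypothetical figures, not taken from the text.

```python
def capital_output_ratio(x, i):
    """Capital value divided by output value for outlays x[0..h-1] of an h-year process.

    Output value:  sum of x[t] * (1+i)**(h-1-t).
    Capital value: sum of (x[0] + ... + x[t]) * (1+i)**(h-1-t).
    """
    h = len(x)
    g = 1.0 + i
    output = sum(x[t] * g ** (h - 1 - t) for t in range(h))
    capital = sum(sum(x[: t + 1]) * g ** (h - 1 - t) for t in range(h))
    return capital / output

outlays = [1.0, 1.0, 1.0, 1.0]           # hypothetical: one unit of outlay per year
rates = [0.0, 0.25, 0.5, 1.0, 2.0, 9.0]  # illustrative interest rates
ratios = [capital_output_ratio(outlays, i) for i in rates]

for i, r in zip(rates, ratios):
    print(f"i = {i:<5} capital/output = {r:.3f}")
```

For these figures the ratio falls steadily as i rises and stays above unity, consistent with the argument that the remaining fraction shrinks toward zero.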
behavior can occur. There now seems to be a fair amount of agreement
among the participants in the debate as to the immediate implications of 
reswitching and related behavior. Where the debate continues and opinions 
continue to be diametrically opposed is about the likelihood of their 
occurrence in practice and the seriousness of their implications for the 
neoclassical model. 

Reswitching refers to a case in which there are two (or more) techniques, 
call them A and B such that A is cheaper when the interest rate is very 
high, B becomes less expensive at an intermediate interest rate, but when 
the interest rate is very low, A again becomes the cheaper way of producing 
their product. Thus, as the interest rate decreases the optimal technique
switches from A to B and then reswitches back from B to A. Let us see
how this can occur. Suppose method A involves the planting of firewood
two years before it will be used, with no further work done on it before its
utilization time, while method B involves its planting three years before
utilization and requires a second expenditure of effort in tending the trees
the year before they are used. Table 3 gives illustrative figures for the
input requirements of the two processes.¹²


TABLE 3. OUTLAYS OF LABOR PER UNIT OF OUTPUT:
RESWITCHING CASE


           3 Years Before   2 Years Before   1 Year Before
Process    Harvest          Harvest          Harvest          Total

A          0                7                0                7
B          2                0                6                8


It will be noted that there is no way we can say a priori that one of 
these processes is the more "roundabout," i.e., more time-consuming than 
the other. Process B expends labor both earlier and later than A, so there 
is no direct way of rating them in terms of their relative use of time. 

We can see why A will be favored both by a high and by a low interest
rate. A very low interest rate favors A because it expends less labor than
B in total so that when there is zero discounting the total cost of A will be
7 while that of B will be 8. Somewhat heavier discounting will, however,
reduce most heavily the 6-labor-hour expenditure of B because it occurs
two years after the beginning of the process and so is discounted at the
rate $D^2 = 1/(1+i)^2$ while the 7-hour expenditure of process A is
discounted only at the rate $D = 1/(1+i)$. Thus at some intermediate level of
interest rates (to be specified presently), B will become less costly than A.


¹² The figures are taken from Samuelson, op. cit.
But, finally, when the rate of interest goes high enough, both the second-
year expenditure of process A and the third-year expenditure of process B 
will be reduced to negligible present values. Only the first-year expenditure 
of B, which is totally undiscounted, will continue to be substantial so that 
B will again be the more expensive process. 

To take the extreme case, as i approaches infinity the present value of
any postponed expenditure approaches zero since D = 1/(1 + i) then
approaches zero. The present value of the cost of process B will then 
approximate just the undiscounted 2 labor hours of the first year of that 
process, while the cost of A will approximate zero since its outlays are all 
discounted. 

We can verify all this using the cost formulas for the two processes, 


\[ c_a = 0 + D \cdot 7 + 0 = 7/(1+i) \]
\[ c_b = 2 + 0 + D^2 \cdot 6 = 2 + 6/(1+i)^2. \]


Substituting in successively the values i = 0, i = 0.25, i = 0.5, etc., we
obtain the present values of the costs of the projects, shown in Table 4,


TABLE 4. COSTS OF PROJECTS A AND B AT SELECTED INTEREST RATES


i      0     0.25    0.5     0.75    1.0    1.25    ...    ∞

c_a    7     5.6     4.67    4       3.5    3.1     ...    0
c_b    8     5.84    4.67    3.96    3.5    3.19    ...    2


which confirm the cost behavior that has been described. For 0 ≤ i < 0.5,
c_a is less than c_b. For 0.5 < i < 1.0, B is the less expensive technique,
while for 1.0 < i, A is once again the more economical. Furthermore, we
see that the crossover points occur precisely at i = 0.5 and i = 1.0, at
each of which¹³ c_a = c_b.
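The entries of Table 4 and the two crossover rates can be reproduced with a few lines of code. This is a minimal sketch using the two cost formulas given above; the function names `cost_a` and `cost_b` are hypothetical labels, and the quadratic 2i² − 3i + 1 = 0 is what the condition 7/(1 + i) = 2 + 6/(1 + i)² becomes after multiplying through by (1 + i)².

```python
import math

def cost_a(i):
    """Present value of process A: 7 labor-hours discounted one period."""
    return 7.0 / (1.0 + i)

def cost_b(i):
    """Present value of process B: 2 hours undiscounted plus 6 hours discounted two periods."""
    return 2.0 + 6.0 / (1.0 + i) ** 2

for i in [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]:
    ca, cb = cost_a(i), cost_b(i)
    if abs(ca - cb) < 1e-9:
        cheaper = "tie"
    else:
        cheaper = "A" if ca < cb else "B"
    print(f"i = {i:<5} c_a = {ca:5.2f}  c_b = {cb:5.2f}  cheaper: {cheaper}")

# Crossover rates: roots of 2i^2 - 3i + 1 = 0 by the quadratic formula.
a, b, c = 2.0, -3.0, 1.0
disc = math.sqrt(b * b - 4 * a * c)
crossovers = sorted([(-b - disc) / (2 * a), (-b + disc) / (2 * a)])
print(crossovers)
```

Both roots are positive, so the cheaper technique changes twice as the interest rate moves from zero upward: from A to B and back to A.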


¹³ Clearly, reswitching requires that there be more than one crossover point. We
obtain two here because the relevant equation is of second degree and
so it can yield two solutions for i; i.e., it can have two positive roots. To obtain the
crossover points we proceed as we did in the neoclassical case, solving the equation
$c_a = c_b$ or $7D = 2 + 6D^2$. We can obtain D directly from this quadratic equation, or
we can substitute D = 1/(1 + i) to get $7/(1+i) = 2 + 6/(1+i)^2$ or
$2(1+i)^2 - 7(1+i) + 6 = 0$ or $2i^2 - 3i + 1 = 0$, which, by the usual formula for the solution
of a quadratic equation, has the two positive roots i = 0.5 and i = 1.0, and hence
there are two crossover points, as reswitching requires.

This result is generalized in the Appendix to this chapter. As discussed in Section 5
of the following chapter, the multiple roots of the crossover formula and its reswitching
Thus, we see that, at least in principle, it is possible for some technique
to be superior both when the interest rate is very high and when it is very 
low, though not at intermediate interest rates. But this case, as we will 
see now, violates each of the four properties of the neoclassical parable 
described in the preceding section: 


Violation of Parable Property 1. Obviously, where reswitching occurs,
decreased interest rates do not always replace one technique successively
by another, each of which is more roundabout than its predecessor. This
no longer is true, first because reswitchable techniques, as we have seen,
cannot be classed unambiguously as more or less roundabout. More
important, if interest rates fall sufficiently, reswitching will bring back
techniques which had also been favored at high interest rates. There is no
longer a unique ranking of techniques as there was in the neoclassical
models in terms of the order in which they are favored by successive
reductions in interest rates.


Violation of Parable Property 2. In the neoclassical tale, there must be
diminishing returns to increased capital use. That is precisely why
decreasing interest rates will always favor the adoption of more
capital-intensive (“roundabout”) processes. That is, as shown in Figure 3a, the
marginal product of capital (MPC) must fall monotonically (holding the
quantity of labor constant), if a fall in interest rate (from $i_1$ to $i_2$) is always
to induce an increase in use of capital (from $K_1$ to $K_2$). But the reswitching
case has shown us that a sufficient reduction in interest rates can get
society back to exactly the same technique which it would have used at
a high interest rate and hence exactly the same use of capital. That is, the
MPC curve must involve exactly the same quantity of capital¹⁴ at interest
rates $i_1$ and $i_2$ (Figure 3b) so that it may be “backward bending” as shown
in the figure, or it may be C-shaped, or of some more complex shape but
it cannot be steadily downward sloping as in Figure 3a. That is, the
reswitching case contradicts the claim that marginal returns to capital must
always diminish.


Violation of Parable Property 3. In the neoclassical case it was claimed
that a rise in interest rate would always ultimately cause a fall in output
per worker because it would reduce the utilization of capital. While a rise


implications are phenomena which have been recognized for some time for the invest- 
ment plans of a single firm. However, the recent debate seems to have been the first 
time they were considered for an entire economy. 

¹⁴ Here we mean the same physical quantity of capital. The market price of the
capital must change as the interest rate changes and varies the value of D at which one 
discounts the outlays that create the capital. 
in interest rate will allegedly permit a temporary increase in consumption
in the short run because it involves less investment for the future, less
output per worker must be the ultimate consequence. However, the
reswitching case again shows that this need not be true, because, with both
a high and a low interest rate, society will use the same technique with the
same present and future consumption and output levels. In fact, it is not
difficult to show that as interest rates rise and society consequently goes
from technique A to B and then from B back again to A, at some point
the rise in interest rate can actually produce a decrease in current
consumption and an increase in future output per worker, the very opposite
of what had usually been assumed.¹⁵


Violation of Parable Property 4. Similarly, the reswitching case violates 
the conclusion that had previously been drawn from the neoclassical cases 
to the effect that a rising interest rate will always decrease the capital- 
output ratio. Roughly speaking,¹⁶ it shows that at both a high and a low
interest rate we may have the same process yielding the same output with 
the same amount of capital. 


These, then, at least illustrate the main conclusions that have emerged 
from the reswitching discussion. As already indicated, all parties to the 
debate seem to agree about the validity of these results. Where the discus- 
sion rages hot and heavy is on the likelihood of the occurrence of the re- 
switching phenomenon in practice and the seriousness of its damage to the 
neoclassical position. 

With respect to the first of these issues the Anglo-Italian group
maintains that their discussion has shown the neoclassical model to represent
“an entirely isolated case,”¹⁷ while the partisans of M.I.T. have suggested
that, for an economy as a whole, reswitching may be extremely rare and
perhaps even nonexistent because “the conditions under which it can be
ruled out are very weak.”¹⁸ Moreover, the Anglo-Italians suggest that
“the implications of the phenomenon of reswitching of techniques for
marginal capital theory appear to be more serious the deeper one goes in 


¹⁵ See Samuelson, op. cit., p. 579.

¹⁶ The argument is not quite right since the money value of both capital and output
and, therefore, the capital-output ratio will be affected by the change in the value of D
resulting from the variation in i.

¹⁷ L. L. Pasinetti, “Switches in Technique and the ‘Rate of Return’ in Capital
Theory,” Economic Journal, Vol. 79, 1969, pp. 508-31.

¹⁸ J. E. Stiglitz, “The Cambridge-Cambridge Controversy in the Theory of Capital:
A View From New Haven,” Journal of Political Economy, Vol. 82, July-August 1974,
p. 897. The reference to New Haven should not be taken to be a claim to impartiality.
Professor Stiglitz, who has, in any event, since moved elsewhere, makes no bones about
his adherence to the M.I.T. views of the matter and expresses them with great lucidity.
uncovering them” (Pasinetti, op. cit., p. 529), while the M.I.T. view is
that “. . . reswitching has no implications for the validity of neoclassical 
distribution theory" (Stiglitz, op. cit., p. 897). 


13. Current Neoclassical Theory: The Factor Price Frontier and Labor's Share 


The current neoclassical position on distribution theory can roughly be 
summed up in four assertions: 


a. That the analysis of the distribution of income must ultimately
be based on a full model of general equilibrium and its price
determination process, as was discussed in Section 5 of this chapter.

b. That while there are dangers in the use of models based on
aggregative production functions, with a single variable representing
aggregate capital, such aggregation is quite unnecessary for a full
general equilibrium analysis. Moreover, it is believed “. . . that, under
most circumstances and for most problems, the errors introduced as
a consequence of aggregation of the kind involved in standard macro
analysis are not too important” (Stiglitz, op. cit., p. 899). Thus, it is
reasonable to build macro models for theoretical analysis or
econometric estimation using a single figure for the aggregate quantity
of capital which is as defensible as any other index number construct.

c. Though the reswitching phenomenon “causes headaches for
those nostalgic for the old time parables of neoclassical writing, we
must remind ourselves that scholars are not born to live an easy
existence” (Samuelson, op. cit., p. 583). In other words, it is not so
easy as had previously been imagined to draw general qualitative
conclusions from the neoclassical distribution theory, though the
structure of that theory remains unaffected.

d. The theory does have an instrument of analysis from which
one can draw a few general implications about the share of wages
and capital in the national income, presumably the main issue to
which distribution theory addresses itself. This analytic instrument
is the factor price frontier.

The factor price frontier shows all the combinations of real wage rates
and real interest rates made possible by any given technique or combination
of techniques. In Figure 4a AA’ is such a curve for some single process.
It shows, for example, that if this process were the only one employed
and the rate of interest were zero, then the competitive wage rate would
be w*, but it indicates also that as the rate of interest rises the real wage
rate will decrease steadily from that maximum figure. Such a factor price
frontier is deduced from the zero-profit conditions of competitive equilibrium,
which are critical components of the price determination process
and which indicate, at each level of interest rate, for each technique, the
maximum amount that a competitive firm can afford to pay its workers
without incurring a loss.


Figure 4

Figure 4b shows what happens to the factor price frontier when, as is 
true in reality, there are a number of production techniques available. 
Here we see superimposed the factor price frontiers for three techniques r,
s, and u. Technique r has the factor price frontier AA’ of Figure 4a. But
the frontier of technique s crosses AA’ at point B". To the right of that 
point (i.e., at interest rates higher than a) by using technique s firms can 
afford to pay wages higher than those indicated by AA’. Competition by 
firms for the available labor will therefore force them to adopt technique s 
and pay the higher wage. Thus for the pertinent range of interest rates the 
social factor price frontier will be B" D". Similarly, to the right of D", 
competition will force the adoption of technique u, and there the frontier 
will be D"D'. In sum, the social frontier will be AB" D" D', the upper 
segments of the frontiers for the individual productive techniques. 

Figure 4c shows us the same relationships for the reswitching case of
the preceding section. Here we see that for 0.5 < i < 1.0 technique b
permits higher wages to be paid, and it will therefore be used. However, 
for interest rates below or above this range technique a will be utilized, 
just as our previous discussion showed. Here, then, the social factor price 
frontier will again be composed of the upper segments of the frontiers for 
the individual techniques; in this case it will be EFGH. 
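The shape of the social frontier in the reswitching case can be sketched numerically. The construction below rests on assumptions not spelled out in the text: output is the numeraire (one unit sells for 1), wages are paid when the labor is performed, and the zero-profit condition equates the price to the wage bill compounded forward to the harvest date, using the timing of Table 3. The function names are hypothetical.

```python
def wage_a(i):
    """Zero-profit wage for technique a: 7 hours, two years before harvest."""
    return 1.0 / (7.0 * (1.0 + i) ** 2)

def wage_b(i):
    """Zero-profit wage for technique b: 2 hours three years and 6 hours one year before harvest."""
    return 1.0 / (2.0 * (1.0 + i) ** 3 + 6.0 * (1.0 + i))

def social_frontier(i):
    """Technique paying the higher wage at rate i, with that wage (upper envelope)."""
    wa, wb = wage_a(i), wage_b(i)
    return ("a", wa) if wa >= wb else ("b", wb)

for i in [0.0, 0.25, 0.75, 1.5, 3.0]:
    technique, wage = social_frontier(i)
    print(f"i = {i:<5} technique {technique}, w = {wage:.4f}")
```

Under these assumptions technique b pays the higher wage only between i = 0.5 and i = 1.0, and the envelope still slopes downward throughout, in line with the description of Figure 4c.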

From the frontier it is possible to deduce for any level of the interest 
rate not only the real wage rate, w, but also the share of wages in total 
output, wL/y, where L is the size of the labor force and y is the value of 
total output. By subtraction, one also obtains, residually, the share of 
earnings of other types of income as (y — wL)/y. 
One of the main conclusions drawn about the factor price frontier is
its negative slope: the notion that lower interest returns to capital in a
competitive economy will result in higher wage rates. Indeed, Samuelson
concludes “. . . the neoclassical parable remains valid as far as the factor-
price frontier trade-off between real wage and profit rate is concerned.
But that is all that remains valid regardless of reswitching.”¹⁹


14. Some Last Comments on the Debate 


For an uninvolved observer²⁰ it is easy to come away seeing considerable
merits in both positions, more than the partisans in some of their
more acrimonious pronouncements are willing to concede. First, it seems to
me that with all its imperfections, and all the caveats with which one must
circumscribe its conclusions, the general equilibrium model of neoclassical
form is still the only complete analytical game in town. The alternative
distribution models that have so far been proposed, while sometimes highly
suggestive, are nevertheless rather ad hoc, are incomplete, and suffer from
the same sorts of problems of aggregation as any neoclassical macro models.

Yet the debate seems to have revealed more clearly two serious
deficiencies of the neoclassical model. First, it has shown the very limited
conclusions we can derive from it that relate to policy and other forms of
application. No longer can we be certain that the interest rate bears a
simple and reliable relationship to the capital-output ratio, to the
productivity of labor and, hence, to workers’ living standards. Second, it has
reemphasized the absence of historical, institutional, or sociological content
in the theory. Surely, these considerations do have a good deal to say
about the process of distribution. The economic position of the worker in
Western Europe and the United States is very much different from what
it was before World War I, and in ways about which neoclassical theory
by itself can tell us very little. There is an even more fundamental difference
between the position of a workman in the late middle ages and his place
in the economy after the Industrial Revolution. A theory of distribution
which ignores such fundamental relationships is, indeed, a performance of
Hamlet in which the prince of Denmark does not appear.


¹⁹ Op. cit., p. 575. It should be noted that Nuti has argued that even this conclusion
can have its exceptions: that the factor price frontier can have positively sloping
segments. See D. M. Nuti, “Capitalism, Socialism and Steady Growth,” Economic
Journal, Vol. 80, 1970, pp. 32-54.

²⁰ Since this section represents a personal view of the matter I should hasten to add
that I have (or at least until now have had) friends on both sides of the discussion. I
have written this section fully aware of the folly of trying to pass judgment on a debate
that has not yet run its course.
But to make progress in the directions that are suggested here it is not
necessary to throw away what has already been achieved. It seems to me 
that there is little to be gained by launching further attacks on the general 
equilibrium model. It does work, it does have valuable uses, and we have 
no substitutes for it. Rather, the urgent task is to extend the analysis or 
to produce supplementary or even alternative analyses which do give us 
insights into such issues as distribution and poverty, distribution and 
income inequality, and distribution and historical developments. 
APPENDIX: COMMENTS ON THE MATHEMATICS OF RESWITCHING


and to these we apply the discount factor $D^2$. If, for example, $c_a$ had
involved the expenditure of, say, 5 man-hours, three periods in the future,
we would have $c_a = 7D + 5D^3$, and the crossover equation $c_a = c_b$ would
be of third degree with the possibility of re-reswitching (three positive roots).
That is, there might now be three discount rates at which a pair of tech- 
niques is equally costly. More generally, where at least one candidate 
technique involves expenditures n periods in the future, the crossover 
equation will be of nth degree with the possibility of as many as n positive 
solutions for D, and, at least in principle, of n switches of techniques. 

Comment b. Why no reswitching in the higher degree equations of the
neoclassical parable? Even in the neoclassical illustrations of Section 11
we had quadratic crossover equations such as $c_{II} = c_{III}$, i.e.,
$4D + 4D^2 = 2 + 2D + 2D^2$, and here, too, longer processes will introduce equations of
higher degree. Why do these not involve two or more roots and hence, the
possibility of reswitching? The answer is that such equations happen to
have only one positive root (for example, the equation $c_{II} = c_{III}$ can be
rewritten, by collecting terms, as $D^2 + D = 1$, whose solutions are
$-0.5 \pm \sqrt{1.25}$). There is a mathematical theorem about the signs of the
coefficients of an equation of the nth degree (Descartes’s rule of signs)
which tells us that if such an equation is of the form

\[ a_n D^n + a_{n-1} D^{n-1} + \cdots + a_k D^k = a_{k-1} D^{k-1} + \cdots + a_1 D + a_0 \]

with all the constants $a_j$ positive, then it will always have exactly one positive root. This result is
almost obvious because at D = 0 the left-hand side (LHS) of the preceding
equation is zero and therefore less than $a_0$, the value of the RHS when
D = 0. But because the LHS has the higher powers of D it will eventually
catch up with the RHS at some sufficiently large value of D, call it D*,
and then remain greater than the RHS for all D > D*. Thus, D* is the
only positive root of the equation. We show next that our neoclassical examples
always satisfy Descartes’s rule. For these models involve choices among
techniques whose costs are of the form

\[ c_w = w_0 + w_1 D + \cdots + w_n D^n \quad\text{and}\quad c_v = v_k D^k + v_{k+1} D^{k+1} + \cdots + v_n D^n \]

where $w_j < v_j$ for each $j \ge k$ and the former is clearly the more roundabout
process. The crossover equation $c_w = c_v$ is then of the form

\[ v_n D^n + v_{n-1} D^{n-1} + \cdots + v_k D^k = w_n D^n + w_{n-1} D^{n-1} + \cdots + w_1 D + w_0 \]

or

\[ (v_n - w_n) D^n + (v_{n-1} - w_{n-1}) D^{n-1} + \cdots + (v_k - w_k) D^k = w_{k-1} D^{k-1} + \cdots + w_1 D + w_0, \]

which is precisely the sort of equation that we know from Descartes’s rule
of signs to have only one positive root since every $v_j > w_j$ so that every
coefficient is positive.
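The contrast between the two cases can be illustrated by counting sign changes in the coefficients. The sketch below is hedged: it uses the general statement of Descartes’s rule (the number of positive roots equals the number of sign changes or falls short of it by an even number), applied to the neoclassical equation D² + D − 1 = 0 and to the reswitching equation 6D² − 7D + 2 = 0, which is 7D = 2 + 6D² rearranged. The helper names are hypothetical.

```python
import math

def sign_changes(coeffs):
    """Count sign changes in a coefficient list (highest power first), skipping zeros."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def positive_roots_quadratic(a, b, c):
    """Distinct positive real roots of a*D^2 + b*D + c = 0, in increasing order."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    r = math.sqrt(disc)
    roots = {(-b - r) / (2 * a), (-b + r) / (2 * a)}
    return sorted(x for x in roots if x > 0)

neoclassical = [1, 1, -1]   # D^2 + D - 1 = 0   (c_II = c_III)
reswitching = [6, -7, 2]    # 6D^2 - 7D + 2 = 0 (c_a = c_b)

for name, eq in [("neoclassical", neoclassical), ("reswitching", reswitching)]:
    roots = positive_roots_quadratic(*eq)
    rates = sorted(1.0 / d - 1.0 for d in roots)  # convert D back to i = 1/D - 1
    print(name, sign_changes(eq), [round(i, 3) for i in rates])
```

One sign change permits exactly one positive root, so a single switch; two sign changes permit two positive roots, which is what makes reswitching possible in the second equation.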


Answers to Problems 


Chapter II, Section 6


2 
l. (a) y= È ast (b) y i 


2. (a) aoz? + aia? + azt + a3 (b) 12 4- 22+ 3? = 14. 


Chapter IV, Section 2


l. dy/dx = i729 — 32a. 

2. dy/dx = —48x!! — 8 sin 4z. 

3. dy/dr = —42z77. 

4. dy/dx = 3e3* sin z + e3? cos T. 

5. dy/dx = (3e3* sin z — e= cos z)/ (sin z)?. 
6. dy/dz = (3e? log £ — e37/2)/ (log z)?. 

7. dy/dz = 602? cos 52*. 

8. dy/dz = —32x-3e27 7". 


Chapter IV, Section 5 


1. (a) 6023 (b) — 2/2”. 
2. Maximum at z — 9 (b) Minimum at z — 0. 
Chapter IV, Section 6 


1Q=2,A =3. 
2.2 = —4,2 = 3. 


Chapter IV, Section 7 


(a) dy = 4x23 dz, + 62322 dzz j 
(b) dy = 4z1 dzı + 1222 dz; 
(c) dy = (Of/dx1) dzı + (2z2 + ðf/ðz2) dz». 


Chapter IV, Section 8 


1l.. (a) w = 5,2 = 7,4 = —50. (b) Q = 0.5, A = 1, à = —6.5. 
2. yx = log z?w + Ai (cos z cos w — 0.3) + A(z/w5 + ev — 10). 


Chapter IV, Section 9 


2. The consumer maximizes U(Q) — PQ. Setting the first derivative equal to 
zero we have dU/dQ — P = 0 or P = dU/dQ. 
3. From PQ — K wehaveQ — K/P and so dQ/dP = —K/P?. Substitute these’ 


values of Q and dP/dQ into the elasticity expression and the result follows at 
once. 


4. Using the rule for differentiation of a product, 


” dC dQ dc dQ de 
marginal t = — = — = Q—+4+ ce = ES o; 
arginal cos dQ ^ àQ Q dQ dQ Q dg 
But when c attains its minimum, dc/dQ = 0. 
5. (a) 25 (b) no (c) two:Qi = 5, Q2 = 20 
(d) second derivative of profit: +0.9 at Q = 5 and —0.9 at Q = 20. 


Chapter V, Section 10 


1. R = 27,2 = 0, y = 3, z = 1, all slacks zero. f 

2. R = 178,2 = 35 y = $, s3 = 13, 81 = s2 = 0 where s; is the slack variable 
of the first constraint, etc. 

3. R = 26,2 = 2, y = 3, s2 = 1, 8) = 83 = 0. 

4. R = 53,0 = 13, sı = 33, s2 = y = 


I 
e 




Chapter VI, Section 1 


Minimize $\alpha = 5V_1 + 7V_2 + 3V_3$


subject to

$4V_1 + 3V_2 + 1V_3 \ge 6$
$1V_1 + 2V_2 + 1V_3 \ge 2$
$V_1 \ge 0,\ V_2 \ge 0,\ V_3 \ge 0.$


Chapter VI, Section 5 


l. a = 7,500, Va = Li = 0, Vs = $, L2 = —3 

8. (a) II = 300, U2 = Qi = U1 = 0, Q2 = 9, Us = 7; 
3. (b) æ = 300, Le = Va = 0, V2 = 2, Lı = 4, Vi = 1 
3 
3 


. (e) H =a = 300 
. (D QıLı = (0)(4) = 0, Q2L2 = (9)(0) = 0, UY1 = (0)(1) = 0, ete. 


4. Because UiVi »é 0. 


Chapter VIII, Section 4


1. LQ, V) = 792 — 20102 
+ Q3 + Vi(400 — Qı — Q2) + V»(Q1Q» — 200),


8L/aQ1 = 14Q1 — 20» — Vit V2Q2 < 0, 0L/0Q» = —2Q1+ 302 — Vat 


VoQi < 0,Q1 9L/0Q1 = Q1(14Q1 — 20s — Vit 
Q»(—2Q1 + 308 — Vi + V2 
àL/9Vs = Q1Q2 — 200 > 0, Vi 
Vs ðL/ð3V2 = V2(QiQo — 200) = 0. 

2. L(Q, V) = 702 — 20102 + Q3 + Vi(Q1 + Q2 
9L/9Qi = 14Q1 — 2Q2 + Vi — V2Q2 
V2Q1 > 0, 9L/0V1 = Qı + Q? — 400 < 0, etc. 

3. L(Q, V) = 60103 4- V:(50 — 207 
4QıVı+ V2 2 0, AL /8Q2 = 120102 — Vi 
0, AL/aV1 = 50 — 20? — Q2 S 
V2 ðL/ðV2 = 0. 


Chapter IX, Section 20 


l. Zapa = $10. 
2. Lapa = $123. 
8. ra = (99pa + 19)/(pa + 48)- 


— Qa) + Vo(Qi — 10), 0L/0Q1 = 6Q 
> 0, Q1 ðL/3Qı = 0, Q2 9L/0Q» = 
< 0, 0L/0Va = Qi — 10 < 0, V1 OL/0V1 = 0, 


VaQo) = 0, Q2 90L/0Q» = 
)-20,90L/90Vi. = 400 — Qi — Q» = 0, 
aL/aV1 = Vi(400 — Qi — Q2 = 0, 


— 400) + V2(200 — Q192), 
> 0,0L/0Q» = —201 + 303+ Vi — 


2— 
2 


Chapter XI, Section 16 


1. (a) K = M/9 L=4M/9 » = 6M-4+. 
l.(DK-4P? L=16P2 A-C-—p, Q=72P. 


2. (a) K = (54 + 2)/16, L = (3M — 2)/8, ^ = (—7M+34)/8. 
2. (b) K = (23P — 5)/14P, L = (11P — 3)/7P, = —P, 
y = (7AP? — 4)/7P2 


Chapter XII, Section 5


1. (a) 500 square feet (b) 2,000 square feet. ] 
2. Any point on OP involves ten hours of vat time per hour of labor time, 


3. Your line should go through the point representing 7,000 vat hours and 300 
labor-hours. OP4 lies above ray OP3 except at the origin. 


Chapter XII, Section 7 


1. 1,500 units of output via process 1 requires 600 labor hours and 3,000 vat- 
gallon hours. Five-hundred units of output via process 3 requires 175 hours of 
labor time and 1,750 gallon-hours of vat time. These add up to the coordinates 
of point G. 

2. The coordinates of E4 are 300 labor hours and 7,000 vat-gallon hours. The 
corresponding coordinates of F4 are 600 and 14,000. 

5. Both segments have slopes of —10. 


Chapter XII, Section 9 
Qı = 1364, Q2=0, Qs = 182, profit = $1,427 
Chapter XII, Section 10 


At point 0 there are unused amounts of both inputs, so both slack variables 
are nonzero. At point S’ only process P’ is used (so Q’ > 0) and some X is 
unused so its slack variable is nonzero. At point B we have no unused inputs 
but Q > 0 and Q’ > 0, etc. 


Chapter XV, Section 10


1. (a) Qn = 4, Py = 10.4, In = 11, Ry = 41.6 
0) =, Pe=6  IL-—H0 = 99 
(c) Q. = 5, P. = 10, IL = 10, R, = 50 
2. (a) Qn = 6, Py = 14, Ig = I7, R= 84
(b Q, = 8, P, = 11, II, = 13, R, = 88 
() @=7, Pe=12%, Te = 16, R, = 87. 

dPQ P dQ\ 1 P dQ 

3.8 =S*—= dQ 1 pO -—— 

dP PQ (oe Ps “Gap t 

dPQ aP dP Q 1 
4. MR = = P+ — — >j = RON. 1D 
p ^ta" ur e ) 


Chapter XVI, Section 13 


1. Let P = a — bQ, average cost = c -+ kQ, so that C = cQ -+ kQ?. Then II = 
PQ—C = (a—oQ — (6+ k)Q?. Thus II = 0 when Q = 0 or when 


a—c 
uini 7773 
Also, setting dII/dQ — 0 we get for profit maximization, 
a—c 
975045. 


Q1 = 4, Q2 = 4, Pı = 9, Po = 13, MR = 1, 
MRe = 1,I = 78. 


2. (a) discrimination: 


nondiscrimination: Q1 = 3.2, Q2 4.8, P1 = P2 = 10.6, ME = 4.2, 
MR = —3.8, I = 74.8. 
(b) discrimination: Q1 = 0.5, Q2 = 0.67, P1 = 1.5, P2 = 5.0, MR1 = 1, 
MR2 = 1, Π = 2H. 
nondiscrimination: Q1 = 0, Q2 = 1⅙, P1 = P2 = 2, MR1 = 2, 
MR2 = −5, Π = 1.5. 


3. To maximize Π = P1Q1 + P2Q2 − C(Q1, Q2) we require 

MR1 = ∂(P1Q1)/∂Q1 = ∂C/∂Q1 and MR2 = ∂(P2Q2)/∂Q2 = ∂C/∂Q2. 

Hence, if ∂C/∂Q1 = ∂C/∂Q2, MR1 = MR2. 
4. (a) Cournot: Q1 = 4, Q2 = 5, Π1 = 6, Π2 = 9, Π = 15. 
Joint maximum: Q1 = 2, Q2 = 4, Π1 = 4, Π2 = 16, Π = 20. 


(b) Cournot: Q1 = 3, Q2 = 3, Π1 = 15, Π2 = 6, Π = 21. 
Joint maximum: Q1 — 23, Q2 = 2%, Ih = 158, II; = 6, I = 21$. 
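
Answers 1 to 3 of this section lend themselves to a quick numerical check. The sketch below is illustrative only. The parameter values passed to `quadratic_profit_outputs` are hypothetical, and the demand curves P1 = 17 − 2Q1 and P2 = 25 − 3Q2 with a common marginal cost of 1 are assumptions chosen because they reproduce the discrimination figures in answer 2(a); neither is quoted from the problems themselves.

```python
def quadratic_profit_outputs(a, b, c, k):
    """For P = a - bQ and average cost c + kQ, return
    (break-even output, profit-maximizing output)."""
    break_even = (a - c) / (b + k)
    optimum = (a - c) / (2 * (b + k))
    return break_even, optimum


def discriminating_outputs(markets, marginal_cost):
    """markets holds (intercept, slope) pairs for P_i = intercept - slope*Q_i.
    Each market's MR_i = intercept - 2*slope*Q_i is set equal to marginal cost.
    Returns a list of (quantity, price) pairs."""
    plan = []
    for intercept, slope in markets:
        q = (intercept - marginal_cost) / (2 * slope)
        plan.append((q, intercept - slope * q))
    return plan


# Hypothetical parameter values for answer 1.
q_zero, q_star = quadratic_profit_outputs(a=20, b=1, c=4, k=1)

# Assumed demand curves and marginal cost, chosen to match answer 2(a).
plan = discriminating_outputs([(17, 2), (25, 3)], marginal_cost=1)

print(q_zero, q_star)  # 8.0 4.0
print(plan)            # [(4.0, 9.0), (4.0, 13.0)]
```

The first result shows the profit-maximizing output at half the break-even output, as the two formulas in answer 1 imply. The second shows marginal revenue equalized across markets, which is the condition derived in answer 3 when marginal costs are equal.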


Chapter XXV, Section 6 
1. (a) 1/1.05 = 0.952 approximate 

1. (b) 0.935 approximate 
2. D = 0.95 approximate, so present value = 500 + 700D + 200D² = 500 + 
700(0.95) + 200(0.91) = 1347 = a/(1 − D) = a/(0.05), or a = 67, approximate. 


678 Answers to Problems 


3.                          Project A   Project B   Project C 
Payout period               $ year      14 year     1 year 
Marginal efficiency         100%        534%        663% 
Discounted present value    $246        $214        $250 
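
The present-value arithmetic in answer 2 can be sketched as follows. The function names are hypothetical, and the 5 per cent interest rate behind the unrounded run is an assumption inferred from answer 1(a). The first run uses the rounded discount factors quoted in the answer, the second uses D = 1/1.05 without rounding.

```python
def present_value(cash_flows, discount_factors):
    """Sum of each cash flow times its discount factor."""
    return sum(cf * d for cf, d in zip(cash_flows, discount_factors))


def level_equivalent(pv, d):
    """Constant payment a, starting now and continuing forever,
    such that a/(1 - d) equals pv."""
    return pv * (1 - d)


flows = [500, 700, 200]

# Rounded factors as used in answer 2: D = 0.95 and D squared taken as 0.91.
pv_rounded = present_value(flows, [1.0, 0.95, 0.91])
a_rounded = level_equivalent(pv_rounded, 0.95)

# Unrounded factors, assuming a 5 per cent interest rate.
d = 1 / 1.05
pv_exact = present_value(flows, [1.0, d, d ** 2])
a_exact = level_equivalent(pv_exact, d)

print(round(pv_rounded), round(a_rounded, 2))   # 1347 67.35
print(round(pv_exact, 2), round(a_exact, 2))    # 1348.07 64.19
```

The present value is barely affected by the rounding, but the level payment is: rounding 1 − D up to 0.05 accounts for most of the gap between 67 and 64.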


Chapter XXV, Section 7 
Max 70z1 + 20z2 + 60z3 + 30z4 + 10z5 

subject to 

20z1 + 7z2 + 15z3 + 8z4 + 2z5 ≤ 40 

10z1 + 8z2 + 20z3 + 5z4 + 3z5 ≤ 30 

z1 ≤ 1, z3 ≤ 1, z5 ≤ 1 
z2 + z4 ≤ 1 
z1 ≥ 0, z2 ≥ 0, z3 ≥ 0, z4 ≥ 0, z5 ≥ 0, 

all zi integer. 
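
Because every variable in the program above is an integer between 0 and 1 (the bound z2 + z4 ≤ 1 together with nonnegativity caps z2 and z4 at 1 as well), the program is small enough to solve by enumeration. The following is a minimal brute-force sketch rather than the method of the text. The coefficients are read from the printed program as reconstructed from a damaged scan, so treat them as assumptions.

```python
from itertools import product

# Objective coefficients and the two budget rows, as read from the program above.
values = [70, 20, 60, 30, 10]
budget_rows = [
    ([20, 7, 15, 8, 2], 40),
    ([10, 8, 20, 5, 3], 30),
]


def total(coeffs, z):
    return sum(c * x for c, x in zip(coeffs, z))


def feasible(z):
    # Projects 2 and 4 are mutually exclusive: z2 + z4 <= 1.
    if z[1] + z[3] > 1:
        return False
    return all(total(coeffs, z) <= limit for coeffs, limit in budget_rows)


# Enumerate all 32 zero-one vectors and keep the feasible one with the largest value.
candidates = [z for z in product((0, 1), repeat=5) if feasible(z)]
best = max(candidates, key=lambda z: total(values, z))
best_value = total(values, best)

print(best, best_value)  # (1, 0, 1, 0, 0) 130
```

Under these coefficients the second constraint is exactly binding at the optimum, which is why no further project can be added to projects 1 and 3.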


Chapter XXV, Section 8 


Present value of A = $416 (approximate) 
Present value of B = $428 (approximate) 
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in ar model, 634— 
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independent, 394 
logic of, 21-22 
Decision theory, 458-76 
axiomization in, 470-71 
Neumann-Morgenstern utility and 
Bayes criterion, 471-75 
foundations of statistics and, 475-76 
investment risk and, 624-25 
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compensated demand function in, 
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Duality (cont.): 
expenditure function in 
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utility function, 356-60 
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decentralized decision-making, 
117-18 
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Index 683 


Equilibrium (cont.): 
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Expansion path, 279-80, 294 
Expected payoff, 426-28 
Expected utility, 426-28 
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of origin, 365 

slack variables and, 81-84 

of solutions, 79, 100 

in dual problems, 113 
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Finite-horizon method, 619-20 




Firms 
competitive, 395-98 
pure competition, 393 
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for n-person games, 452-57 
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solution, 455-57 
strategies in 
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450 
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mixed, 444-50 
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utility theory and, 420-21 
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General equilibrium 
activity analysis and, 549-68 
dual prices and decentralized 
decision making, 565-66 
existence problem, 549-54 
integer programming, 566-68 
uniqueness problem, 549-51, 554— 
57 
von Neumann model of expanding 
economy, 557-59 
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537 
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theory of money and, 479-94 
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87 
equations, 480-83 
interdependence in economy, 479-80 
optimal cash balances, 490-94 
real balance effect, 487-88 
welfare economics and, 496-534 
beneficial and detrimental exter- 
nalities of production and con- 
sumption, 517-21 
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maximands, 497—500 


Pareto optimality and productive 
efficiency, 501-10 


pure competition and monopoly, 
511-12 


resource allocation, 496-97 




General equilibrium (cont.): 
theorem on democratic group de- 
cisions, 531-34 
General marginal and average curves, 
31 
Geometric interpretation, 49-51 
Geometric representation, 141-44 
Geometry 
of equilibrium points, 442-43 
of linear programming, 77-81 
Global maximum, 38-40, 56 
Goods 
complementary, 479-80 
distribution of, 503-8 
inferior, 207 
public, 521-22 
substitute, 479-80 
Graaf, J. de, 511n 
Gradients, 153 
Great Depression, 526 
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Group decisions, democratic, 531-34 
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Hawkins-Simon conditions, 544-46 
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of capital, 641-42 
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Hicks, J. R., 211, 339, 528n, 578 
High probability of success, 429-30 
Hirshleifer, Jack, 610n, 651n 
Historical sunk costs, 598 
Hitch, Charles J., 412n, 520n 
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of capital, 641-42 
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Identification, 238-45 
theorems of, 247-52 
Identity signs, 482 
Ignorance, zone of, 349 
Imperfections of market, 631-32 
Imputation, 454 
Income 
changes in price and, 206-9 
of consumers, 202-4 
real, 350-53 
retention of, 630-31 
Income-consumption curve, 207, 208 
Income effect 
role of, 211-12 
substitution effects and, 209-11 
Increasing costs, 592-95 
Increasing returns to scale, 273 
Independence, 429 
Independent decision-making by firms, 
394 
Index 
summation, 19 
utility theory, 432-35 
construction of, 424-26 
Index numbers of real income, 350-53 
Indifference curve 
consumer, 278 
production, 303-9 
profit, 309-11 
Indifference map, 193-206, 348-50 
equilibrium of consumer and, 204-6 
price lines and, 202-4 
properties of, 195-98 
satiation and lexicographical orderings and, 198-201 
Indivisibilities, 274, 613-18 
Industry, competitive, 398-400 
Inessential games, 454 
Inferior goods, 207 
Information 
for quasi-collusion, 451-52 
threat, 451 
Input-output analysis, 537-48 
dynamized model for, 541-48 
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Input-output analysis (cont.): 
economic problem and assumptions 
of, 537-39 
mathematics of, 539-41 
theorems of, 543-48 
Hawkins-Simon conditions, 544— 
46 
Samuelson substitution theorem, 
543-44 
series approximation of solution, 
546-48 
Input supply curves, backward-rising, 
586-89 
Inputs, 267-68 
average, 315-17 
combinations of output and, 386-88 
dated, 643-46 
in fixed supply, 591-92 
heterogeneous, 592-95 
limited quantities for, 273 
marginal, 315-17 
multiple outputs and, 382-83 
relative levels of, 269-70 
required by outputs, 365 
response to relative prices for, 287— 
89 
total, 315-17 
Institutional determinants of saving, 
656-57 
Instrumental variables, 262-64 
Integer programming, 566-68 
Integrability, 346n 
Interdependence, 613-18 
in economy, 479-80 
in oligopoly, 409-12 
Interest rates 
cost of capital and, 633 
diminishing returns and, 654-56 
institutional and sociological deter- 
minants of saving and, 656-57 
monetary theory of, 653-54 
neoclassical investment models and, 
657-62 
producers’ demand for investment 
and, 648-50 
reswitching techniques and, 662-67 
supply of saving and, 650-53 
Interior maximum, 39 
Interior points on line segment, 216-18 
Internal economies, 517 
Internal rate of return, 605 


Interview approaches to demand de- 
termination, 228-30 
Inventory levels, 4 
determination of cost relationship in, 
7-8 
optimality analysis of, 5-7 
optimality calculation for, 8-10 
Investment, 597-99 
discounting and 
opportunity costs, 601-3 
present value, 599-601 
financing of 
convertible securities, 628 
direct loans, 628 
issuing new shares of common 
stock, 625-26 
optimal financial policy, 628-32 
plowback, 625 
marginal efficiency of, 604-13 
discounted present value vs., 605- 
13 
neoclassical models of, 657-62 
payout period for, 603-4 
producer’s demand for, 648-50 
risk and, 619-25 
decision theory and Neumann- 
Morgenstern utility, 624-25 
discounting for, 620-21 
finite-horizon method, 619-20 
probability theory approach, 621- 
23 


sensitivity analysis and, 624 
See also Capital 
Iso-product curves, 276-78 
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Kemeny, John G., 558 


Keynes, John Maynard, 490, 585, 589, 
4 


Keynesian liquidity preference, 653 
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Koopmans, T. C., 72n, 250n 

Kuhn, Harold W., 72n, 105n, 156, 430 
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Labor 
saving and, 586-89 
share of 
constancy of, 576-77 
factor price frontier and, 667-69 
unionization of, 589-91 
Lagrange multipliers, 62-67 
Lagrangian expression, 157, 159-62 
Land, fixed supply of, 591-92 
Lange, Oskar, 483n 
Laplace criterion, 462-63 
La Salle, Ferdinand, 583n 
Least squares bias, 245-47 
Lemke, E. C., 122 
Lending, 650-53 
Leontief, Wassily W., 370, 372, 373, 
537, 539 
Leverage, effects of, 628-30 
Lexicographical orderings, 198-201 
Limited input quantities, 273 
Limited variable range problems, 52-54 
Linder theorem, 339-42 
Linear equations, 14-17 
Linear programming, 72-103 
algebra and geometry in, 77-81 
basic solutions in 
initial, feasibility and optimality 
criteria for, 87-90 
pivoting process, 90-99 
slack variables and feasibility 
solutions, 81-84 
basic theorem of, 149-52 
simplex method and, 84-86 
in cases where origin is not feasible 
solution, 100 
characteristics of, 74-77 
duality in, 105-38 
decentralized decision-making 
and, 117-18 
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Linear programming (cont.): 
economic interpretation of, 109-12 
simplex method and, 124-26 
solution of primal and dual programs, 118-24 
theorems of, 113-17, 127-33 
production theory and, 297-317 
alternative diagram, 299-300 
alternative types of solutions, 313- 
15 
feasible region, 300-1 
graphic solution, 311-13 
marginal, total and average input 
products, 315-17 
production indifference curves, 
303-9 
profit indifference curves, 309-11 
representation of process, 301-3 
standard problems of, 72-74 
Linearly homogeneous production 
function, 273, 281 
Lintner, John, 623n 
Little, I. M. D., 528n 
Littlechild, S. C., 175n 
Loans, direct, 628 
Local maximum, 38-40, 56 
Logarithms, 18-19 
Logic of decision-making, 21-22 
Long run, definition of, 290 
Long-run cost curves, 291-92 
Lorie, J. H., 606n 
Luce, R. Duncan, 430, 437n, 458n 


McFadden, Daniel, 359n, 364n, 365n 
McKean, Roland, 520n 
Makower, Helen, 615n 
Malthus, Thomas, 580, 581n 
Many-variable relationships, 57-60 
March, J. G., 390 
Marginal analysis, 21-41 
arithmetic relationships in, 23-31 
marginal and average curves, 27- 
31 
total z curves, 25-27 
average figures in business practice 
vs., 34-35 
averages as approximations of 
marginal figures, 35-37 
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Marginal analysis (cont.): 
differential calculus and, 42-46 
economic applications, 67-70 
geometric interpretation, 49-51 
nondifferentiability and limited 
variable range problems, 52-54 
rule of differentiation, 47-49 
fixed costs and, 31-34 
logic of decision-making and, 21-22 
theorems on resource allocation, 
22-23 
second-order optimality conditions 
in, 37-41 
global, local and corner maxima, 
38-40 


stability and, 40-41 
Marginal cost pricing, 512-16 
Marginal efficiency of investment, 604-13 


Marginal inputs, 315-17 
Marginal productivity, 573-77 
Marginal rate of substitution, 197 
Marginal rule for output levels, 508-10 
Marginal utility, diminishing, 191 
Marginal z curves, 27-31 
"Market failure," 521-22 
Market imperfections, 631-32 
Market structures, classification of, 
393-95 
Markowitz calculation, 623, 624 
Marshall, Alfred, 517, 653-54 
Marx, Karl, 572, 583n 
Massachusetts Institute of Technol- 
ogy, 666 
Maximands, 497-500 
Maximax criterion, 461 
Maximin criterion, 460-61 
Maximin strategy, 439-40 
protective power of, 440-41 
Maximization 
comparative statics and, 319-42 
constrained, 62-67 
joint, 417-48 
in many-variable relationships, 57— 
60 


of profits, 379-81, 395-98 
sample calculations, 390-91 
of sales, 383-85, 390-91 
second-order conditions of minimiza- 
tion and, 54-57 


Maximum conditions, Kuhn-Tucker, 
163-64 

Maximum likelihood method, 254-57 
Measures 

associative, 421 

cardinal, 422-24 

of elasticity, 186-90 

orderings or rankings as, 422 

of quantity of capital, 642-46 
Miller, M. H., 628, 630 

See also Modigliani-Miller model 
Minimax regret criterion, 463-64 
Minimax strategy, 439-40 

equilibrium points and, 441 

second-order conditions of, 54-57 
Mixed strategies, 464-66 

in game theory, 444-50 
Modigliani, Franco, 628, 630 
Modigliani-Miller model, 634-35 
Monetary interest theory, 653-54 
Monetary units, 498 
Money, theory of, 479-94 

determination of price levels in, 483-87 


comparative statics in, 488-90 
equations in, 480-83 
interdependence in economy and, 
479-80 
optimal cash balances and, 490-94 
real balance effect in, 487-88 
Monopolistic competition with product 
differentiation, 394, 402-4 
Monopoly 
bilateral, 394, 407-9 
discriminating, 394, 405-6 
elementary mathematical analysis 
of, 416-18 
pure, 394, 401-2 
unions as, 589-91 
welfare economics and, 511-12 
Monopsony, 394, 404-5 
Monotonic transformations, 214-16 
Morgenstern, Oskar, 421, 437n, 455, 
558n 
Mossin, Jan, 623n 
Multiple products, 382-83 
firms producing, 275-76 
Multiple regression technique, 235 
Multiplication 
exponents and, 17 
logarithmic, 20 


Multivariable models, second-order 
conditions in, 327-29 
Mutually correlated variables, 236-37 


N 


n-person games, 452-57 
characteristic function in, 455 
coalitions in, 453 
core of, 454-55 
domination in, 455 
imputation in, 454 
side payments in, 453-54 
solution of, 455-57 
n-variable problems, constrained, 333- 
36 


Nature, games against, 459 
Negative slope, 14 
Neoclassical theory 
current, 667-69 
investment models in, 657-62 
of utility, 431-32 
cardinal utility in, 193 
Neumann-Morgenstern utility, 420-35, 
471-75 
investment risk and, 619-25 
Net complement, 213 
Net substitute, 213 
Nominal discount rates, 601 
Nonconstant sum games, 450-52 
Nonconvex regions, 144-46 
Noncooperative games, 450 
Nondifferentiability, 52-54 
Nonlinear programming, 140-54 
algebraic notation for, 140-41 
basic theorem of linear programming 
and, 149-52 
concave and convex functions in, 
147-49 
convex and nonconvex regions in, 
144-46 
geometric representation in, 141-44 
nonlinear constraints, 141-42 
nonlinear objective functions, 142-44 
Kuhn-Tucker methods of, 156-76 
conditions of, 162-69 
form of Lagrangian expression, 
159-62 
methods of computation for, 152-54 
Nonnegative slack variables, 82 
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Nonsatiety of consumer, 197 
Normal form strategies, 450 
Nuti, D. M., 669 


O 


Objective functions, 9 
in Kuhn-Tucker method, 169 
nonlinear, 142-44 
Objectives of firms, 377-92 
advertising, 385-86 
alternative, 377-79 
choice of input and output combina- 
tions, 386-88 
multiple products and inputs, 382-83 
price-output determination, 383-85 
pricing and cost changes, 381-82 
profit maximization, 379-81 
satisficing and behavior analysis, 390 
Offer curve, 208, 209, 350 
Oligopoly, 394 
interdependence in, 409-12 
pricing under, 414-16 
stability of, 412-14 
Operations research, optimality anal- 
ysis in, 4-5 
Opportunity costs, 116, 601-3 
Optimal activity level, 22-23 
Optimal cash balances, 490-94 
Optimal deviations, 513-16 
Optimal financial policy, 628-32 
Optimal inventory level, 5-10 
Optimal mixed strategies, 446-49, 465 
Optimal price system, 509-10 
Optimal production level, 68 
Optimal production process, 73 
Optimality 
conditions of 
first order, 37-38 
second-order, 37-41 
constrained, 501 
in linear programming solutions, 87- 


Pareto, 561 
productive efficiency and, 501-10 
Optimization, 3-10 
economic analysis and, 5 
comparative statics with 
bordered Hessians, 333-36 
Cournot model, 322-24 
Linder theorem, 339-42 
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Optimization (cont.): 
second-order conditions in multi- 
variable models, 327-29 
Slutsky theorem for consumer, 
336-39 
Slutsky theorem in two-input 
firms, 330-33 
total differentiation, 324-26 
Orderings, 421 
lexicographical, 198-201 
Ordinal utility, 193-95 
Ordinal utility functions, 214-16 
Ordinary variables, 81 
Origin 
feasibility of, 365 
in linear program, 100 
Original constraints, 169 
Output, 267-68 
combinations of input and, 386-88 
inputs required by, 365 
levels of, 508-10 
relative, 508 
relative input levels and, 269-70 
See also Input-output analysis 




P 


Parameters, 319-21 
Pareto, Vilfredo, 527 
Pareto criterion, 527-28 
Pareto optimality, 561 
in distribution of goods among con- 
sumers, 503-8 
for output levels, marginal rule for, 
508-10 
productive efficiency and, 501-3 
Partial derivative, 57 
Partial differentiation, 57-60 
Pasinetti, L. L., 662 
Patinkin, Don, 483n 
Payments 
of dividends, 630-31 
side, 453-54 
transfer, 595 
Payoff matrices, 439, 444 
Payoffs 
equality of, 442 
expected, 426-28 
Payout period, 603-4 
Pecuniary externalities, 518n 
Pigou, A. C., 560n 


Pigou effect, 488 
Pivoting process, 90-99 
choice of column and element for, 
94—97 
special rules of, 92-94 
Planning, centralized, 512-13 
Plowback, 625 
Point elasticity, 185 
Power terms, 47 
Powers 
of exponents, 18 
logarithmic calculation of, 19 
Preferences 
continuity of, 429 
revealed 
index numbers of real income and, 
350-53 
indifference map and, 348-50 
model of, 344-47 
Slutsky theorem in n variables 
and, 347-48 
Present value, 599-601 
discounted, 605-13 
Price-consumption curve, 208 
Price discrimination, calculus of, 416- 
i? 
Price lines, 202-4, 279-80 
Price-output determination, 383-85 
Price system, optimal, 509-10 
Prices 
changes in income and, 206-9 
dual, 565-66 
for input, 287-89 
marginal costs and optimal devi- 
ations between, 513-16 
pitfalls in determination of, 483-87 
dichotomy of pricing in real and 
monetary sectors, 485-86 
homogeneity postulate, 484-85 
Say’s identity, 486-87 
Pricing, 4 
changes in fixed costs and taxes and, 
388-89 
marginal cost and, 512 
oligopoly, 414-16 
Primal programs, 106, 118-24 
Primeaux, W. J., Jr., 415n 
Prisoner's dilemma, 452 
Probabilities 
compound, 430 
of success, 429-30 




Probability theory approach to risk, 
621-23 
Process 
choice of, 298 
representation of, 301-3 
Producers 
consumers and, theory of, 353-55 
investment demands of, 648-50 
surplus for, 497-500 
Product differentiation, monopolistic 
competition with, 394, 402-4 
Product line, 73, 298 
Production 
efficiency in, 501-10 
empirical analysis of, 537 
externalities of, 517-21 
inputs and outputs and, 267-68 
relative input levels, 269-70 
optimal level of, 68 
theory of 
linear programming and, 297-317 
mathematics of, 292-96 
time requisites of, 640-41 
Production frontiers, 276-78 
Production functions, 268-69 
Cobb-Douglas, 286-87 
cost functions and, 366-69 
notation for, 275-76 
profit functions and, 368-69 
homogeneous and homothetic, 280— 
86 
Euler’s theorem, 283-84 
properties of 
diminishing returns, 270-72 
returns to scale, 272-75 
Production indifference curves 
construction of, 303-6 
properties of, 306-9 
Production possibility locus, 276, 561 
Production process, optimal, 73 
Production set, 275-76, 365-66 
definition of, 276 
Productivity, marginal, 573-77 
Products 
derivatives of, 48 
homogeneity of, 394 
multiple, 382-83 
specifications for, 73-74 
Profit functions, 368-69 
cost functions and, 364-66 
definition of, 367 
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Profit indifference curves, 309-11 
Profit maximization, 379-81, 395-98 
sample calculations of, 390-91 
Programming 
integer, 566-68 
linear, 72-103 
algebra and geometry in, 77-81 
basic solutions in, 81-99 
basic theorem of, 84-86, 149-52 
in cases where origin is not feasible 
solution, 100 
characteristics of, 74-77 
duality in, 105-38 
production theory and, 297-317 
standard problems of, 72-74 
nonlinear, 140-54 
algebraic notation for, 140—41 
concave and convex functions in, 
147-49 
convex and nonconvex regions in, 
144-46 
geometric representation in, 141— 
44 
Kuhn-Tucker method of, 156-76 
methods of computation for, 152- 
54 
Protective power of maximin strat- 
egies, 440-41 
Psychological premises in utility the- 
ory, 429-31 
Public goods, 521-22 
Pure capital rationing, 633 
Pure competition, 393-94 
freedom of entry and exit in, 394 
welfare economics and, 511-12 
Pure monopoly, 394, 401-2 
Pure public good, 521 
Pure strategies, 445 


Q 


Quantity of capital, measurement of, 
642-46 

Quasi-collusion, 451-52 

Quasi-concave utility functions, 220-23 


Raiffa, Howard, 430, 437, 458n 
Ramsey, Frank, 513 
Rankings, 421 
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Rates 
discount, 601 
of interest 
cost of capital and, 633 
diminishing returns and, 654-56 
general equilibrium model and, 
646-68 
institutional and sociological de- 
terminants of saving and, 656- 
57 
monetary theory of, 653-54 
neoclassical investment models 
and, 657-62 
producers' demand for investment 
and, 648-50 
reswitching techniques and, 662- 
67 
supply of saving and, 650-53 
of return, internal, 605 
Rationing, pure capital, 633 
Reaction curves, 414-16 
Real balance effect, 487-88 
Real discount rates, 601 
Real income, index numbers of, 350-53 
Rectangular hyperbola, 188 
Reduced-form method, 258-62 
Redundant equation, 481-83 
Relationships 
arithmetic, 23-31 
many-variable, 57-60 
simultaneous, 237-47 
estimation of equations of, 252-54 
identification of problems of, 238- 
45 
least squares bias in, 245-47 
Relative activity levels, 23 
Relative input levels, 269-70 
Relative outputs, optimal, 508 
Relative prices for input, 287-89 
Rent, 592-95 
Reorder cost, 7 
Representation of process, 301-3 
Requirements space, 299n 
Reservation demand, 587 
Resource allocation, 402, 496-97 
theorems on, 22-23 
Responsiveness, measure of, 183 
Reswitching, 662-67 
Retention of income, 630-31 
Returns 
diminishing, 270-72, 654-56 
internal rate of, 605 


Returns (cont.): 
to scale, 272-75 
constant, 273, 281 
Revealed preference 
index numbers of real income and, 
350-53 
indifference map and, 348-50 
model of, 344-47 
Slutsky theorem in n variables and, 
347-48 
Revenue functions, 364-66 
definition of, 367 
Ricardian model, 580-84 
Ricardo, David, 571, 580, 581n, 583, 592 
Risk 
in investment, 619-25 
decision theory and Neumann- 
Morgenstern utility, 624-25 
discounting for, 620-21 
finite-horizon method, 619-20 
probability theory approach, 621- 
23 
sensitivity analysis, 624 
utility theory and, 420-21 
Roots of exponents, 18 
Routing of transportation, 73 
Roy, R., 343, 365n 


S 


Saddle points, see Equilibrium points 
Sales maximization, 383-85 
sample calculations of, 390-91 
Samuelson, Paul A., 343, 543, 554, 
572n, 578, 657n, 662 
Samuelson substitution theorem, 543- 
44 
Sasieni, M., 611n 
Satiation, 198-201 
Satisficing, 390 
Savage, L. J., 463, 606n 
Saving 
institutional and sociological deter- 
minants of, 656-57 
labor and, 586-89 
relationship of wealth to, 488 
supply of, 650-53 
Savings-investment parable, 654-56 
Say's identity, 486-87 
Scadding, John L., 657n 
Scale, returns to, 272-75 
constant, 273, 281 


Schelling, T. C., 451n 
Scitovsky double criterion, 529-31 
Second best, theorem of, 525 
Second-order conditions, 163 
constrained n-variable problems in, 
333-36 
of equilibrium, 191 
of minimization, 54-57 
in multivariable models, 327-29 
Securities, convertible, 628 
Sen, Amartya, 503n 
Sensitivity analysis, 624 
Series approximation of solution, 546- 
48 
Shares of stocks, 625-26 
Sharpe, W. F., 623n 
Shephard, Ronald, 343, 359n, 364 
Shephard-Uzawa duality theorem, 360 
Shephard's lemma, 360-62 
Shifting demand curves, 181-83 
Shock model, 248 
Short run, definition of, 290 
Short-run cost curves, 291-92 
Side conditions, 62 
Side payments, 453-54 
Simon, Herbert A., 390, 545 
Simon, J. L., 414n 
Simple technologies, 657-62 
Simplex method, 124-26 
basic theorem of linear programming 
and, 84-86 
Simultaneous relationships, 237-47 
estimation of equations of, 252-54 
maximum likelihood method, 254- 
57 
method of instrumental variables, 
262-64 
method of two-stage least squares, 
264-66 
reduced-form method, 258-62 
structural equations, 257-58 
identification of problems of, 238-45 
least squares bias in, 245-47 
Slack variables, 81-84 
Slope, 13-14 
Slutsky, Eugene, 211, 339 
Slutsky theorem, 209-11 
for consumer, 336-39 
duality theory applied to, 
in n variables, 347-48 
in two-input firm, 330-33 
Smith, Adam, 571 
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Social welfare function, 531 
Sociological determinants of saving, 
656-57 
Solution of n-person games, 455-57 
Solution space, 299n 
Specifications for products, 73-74 
Stability, 40-41 
of oligopoly, 412-14 
Standard statistical approaches to 
demand determination, 231-34 
Statics, comparative, 319-42, 488-90 
duality and, 362-64 
with optimization 
bordered Hessians, 333-36 
Cournot model, 322-24 
Linder theorem, 339-42 
second-order conditions in multi- 
variable models, 327-29 
Slutsky theorem in two-input firm, 
330-33 
total differentiation, 324-26 
without optimization, 321-22 
parameters and endogenous variables and, 319-21 
Stationary state, 643-46 
Statistical approaches to demand de- 
termination, 231-34 
Statistics, foundations of, 475-76 
Steiner, P. O., 175n 
Stigler, George J., 414n 
Stiglitz, J. E., 666n, 667 
Stocks 
bonds vs., 628-30 
issuing new shares of, 625-26 
Straight-line marginal and average 
curves, 30-31 
Strategies 
extensive form and normal form of, 
450 
maximin, 439-40 
protective power of, 440-41 
minimax, 439-41 
mixed, 444-50, 464-66 
Strictly concave functions, 218-20 
Structural equations, 257-58 
Structural variables, 81 
Substitution, 212-13, 479-80 
diminishing marginal rate of, 197 
elasticity of, 287-89 
Substitution effects, 209-11 
Success, probability of, 429-30 
Summation index, 19 
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Sums, derivatives of, 47 
Sunk costs, 598 
Supply 
fixed, 591-92 
of saving, 650-53 
Supply curves 
competition and, 400-1 
input, backward-rising, 586-89 
Supply function, 294-95 
in general equilibrium equations, 480 
Supply jointness, 521 
Surplus 
consumers’ and producers’, 497-500 
economic rent as, 592-95 
Sweezy, Paul M., 412n 


T 


Taxes, 631-32 
changes in, 388-89 

Technological externalities, 518n 

Technologies, simple, 657-62 

Thompson, Gerald, 558n 

Threat information, 451 

Time, production process requisite for, 
640-41 


Total cost curves, 289 
Total differentiation, 60-61, 324-26 
Total inputs, 315-17 
Transactions costs, 631-32 
Transfer costs, 594 
Transfer payments, 595 
Transitivity, 197, 429 
Transportation routing, 73 
Tucker, A. W., 72n, 105n, 127, 156, 452 
See also Kuhn-Tucker method 
Two-input firm, 330-33 
Two-person games 
nonconstant sum, 450-52 
zero-sum, 450-52 
Two-stage least squares, 264-66 


U 


Unbiased estimates, 252-53 
Undepletability, 521 
Unions, 589-91 
Uniqueness problem, 549-51, 554-57 
Utility 
marginal, diminishing, 191 
ordinal and cardinal, 193-95 


Utility function, 356-60 
ordinal, 214-16 
quasi-concave, 220-23 
Utility theory, 420-35 
classes of measures in, 421-24 
associative measures, 421 
cardinal measures, 422-24 
orderings or rankings, 422 
construction of N-M index for, 424— 
26 
expected utility vs. expected payoff 
in, 426-28 
investment theory and, 624-25 
N-M cardinal vs. neoclassical, 431- 
32 
psychological premises behind pre- 
diction in, 429-31 
index and, 432-35 
risk and game theory and, 420-21 
Uzawa, H., 365n 


V 


Valuation of assets, 599-601 
Values 
equilibrium, 320 
present, 599-601 
discounted, 605-13 
of variables, limitations on, 52-54 
Variable cost of funds, 633 
Variables 
endogenous, 319-21 
instrumental, 262-64 
mutually correlated, 236-37 
omission of important, 234-36 
ordinary, 81 
slack, 81-84 
structural, 81 
values of, 52-54 
Von Neumann, John, 105n, 421, 437n, 
446, 455 
See also Neumann-Morgenstern util- 
ity 
Von Neumann model of expanding 
economy, 557-59 


W 


Wald, A., 553, 557 
Walras, Leon, 578 


Walras’ law, 481-83 
Weak assumption of revealed prefer- 
ence, 345 
Wealth, real, 487 
Wealth-saving relationship, 488 
Weingartner, H. Martin, 614n, 615n 
Welfare economics, 496-534 
activity analysis and, 559-64 
beneficial and detrimental external- 
ities in, 517-21 
breakeven constraints and optimal 
deviations in, 513-16 
centralized planning in, 512-13 
criteria for judgments in, 526-31 
integer programming and, 566-68 
"market failure" and public goods 
and, 521-22 
maximands in, 497-500 
Pareto optimality and productive 
efficiency in, 501-10 
pure competition and monopoly and, 
511-12 
resource allocation and, 496-97 




Welfare economics (cont.): 
theorem on democratic group decisions in, 531-34 

West, Edward, 581n 

Wicksell (author), 643 

Williamson, O. E., 175n 

Willig, Robert D., 500n 

Wolfe, Philip, 152 

Working, E. J., 237 


X 

X curves, 27-31 
Y 

Yaspan, A., 611n 
Z 


Zero exponents, 18 

Zero-sum two-person games, 438-39 
Zeuthen-Harsanyi model, 451n 
Zone of ignorance, 349 
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